RE: The book: Solr 4.x Deep Dive - Early Access Release #1
I'd be willing to pay $30-$40 initial cost, but would then expect to get the revisions at no cost. With the revision model, I'd pay the initial $10 and then $3-$9 per revision (depending on what's in the revision). It's an interesting model, because I can choose not to purchase sections/revisions that I am not interested in. Thanks for doing this, Jack. Very excited! Swati

-Original Message- From: Stevo Slavić [mailto:ssla...@gmail.com] Sent: Friday, June 21, 2013 9:39 AM To: solr-user Subject: Re: The book: Solr 4.x Deep Dive - Early Access Release #1

Consider https://leanpub.com/ for publishing. I'm in no way affiliated with them; I just have positive personal buying experience. One can give more than what the author requested as a minimum price, and I regularly do. For such a work of 1k pages, I would definitely pay more than $10. Kind regards, Stevo Slavic.

On Fri, Jun 21, 2013 at 3:32 PM, AJ Weber awe...@comcast.net wrote:

On 6/21/2013 9:22 AM, Alexandre Rafalovitch wrote: I might, however, be confused regarding your strategy. I thought you were going to do several different volumes, rather than one large one. Or is this all a 'first' volume discussion so far? Pricing: $7.99 feels better for a book this size. Under $5 it feels like it may be mostly filler (even if it is not). I don't think anybody will pay every month just because it got updated.

I agree that I'm a little confused as to the pricing. Are you saying you'll keep updating it and everyone would just d/l the latest version monthly? If so, what's to stop someone from waiting to subscribe until it is entirely complete and just pay the $8 once for the whole thing -- versus those of us (me included) who would be sending our $8 every month and therefore receiving the same work at 10x the price (for example)? I'm with one of the previous responses: I'd be willing to pay $30 for early access (and updates) to an eBook as a one-time cost, and then when you release the final, set it at $40 or more.
RE: Issues in the Fuzzy Query !
Hello, Can you share the exact params you are passing to Solr? Thanks

From: vibhoreng04 [vibhoren...@gmail.com] Sent: June 21, 2013 9:27 AM To: solr-user@lucene.apache.org Subject: Issues in the Fuzzy Query !

Hi All, I have been facing problems with fuzzy queries. For example, if I query ((FIELDNAME1:FRANK~0.80) AND (FIELDNAME1:INDIANO~0.80))^0.80, the parsed query changes my distance grade to ~0. In other cases the distance is changed in the parsed query to ~1 or ~2. Can anyone tell me what the issue is here? If there is an issue in the way I am querying, I would love to hear that.

<str name="rawquerystring">((FIELDNAME1:FRANK~0.80) AND (FIELDNAME1:INDIANO~0.80))^0.80</str>
<str name="querystring">((FIELDNAME1:FRANK~0.80) AND (FIELDNAME1:INDIANO~0.80))^0.80</str>
<str name="parsedquery">(+((+FIELDNAME1:FRANK~0 +FIELDNAME1:INDIANO~2)^0.8))/no_coord</str>
<str name="parsedquery_toString">+((+FIELDNAME1:FRANK~0 +FIELDNAME1:INDIANO~2)^0.8)</str>
<lst name="explain"/>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>

-- View this message in context: http://lucene.472066.n3.nabble.com/Issues-in-the-Fuzzy-Query-tp4072125.html Sent from the Solr - User mailing list archive at Nabble.com.
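For context on why the parser rewrites those values: since Lucene 4.x, fuzzy syntax takes an integer Levenshtein edit distance (0, 1, or 2) rather than the old fractional similarity, so ~0.80 gets converted to one of ~0/~1/~2. A minimal sketch of what that edit distance measures (classic dynamic programming, not Lucene's automaton-based implementation):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```

So FIELDNAME1:FRANK~1 matches terms within one edit of FRANK (e.g. FRANC), and ~2 within two edits; distances above 2 are not supported by the 4.x fuzzy query.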
RE: Steps for creating a custom query parser and search component
Hi Juha, If it's just a matter of format, have you considered adding another layer in front of Solr: a class that takes in your queries in the proprietary format and converts them to what Solr needs? Similarly, if you need your results in a particular format, just convert them back on the way out. I would imagine that'd be a lot simpler than subclassing Solr classes. Swati

-Original Message- From: Juha Haaga [mailto:juha.ha...@codenomicon.com] Sent: Thursday, June 20, 2013 9:33 AM To: solr-user@lucene.apache.org Subject: Steps for creating a custom query parser and search component

Hello list followers, I need to write a custom Solr query parser and a search component. The requirements for the component are that the raw query, which may need to be split into separate Solr queries, is in a proprietary format encoded in JSON, and the output is also going to be in a similar proprietary JSON format. I would like some advice on how to get started. Which base classes should I start to work with? I have been looking at the plugin classes and my initial thoughts are along the lines of the following workflow:

1. Subclass (QParser?) and write a new parser method that knows how to deal with the input format.
2. Subclass (SolrQueryRequestBase?) or use LocalSolrQueryRequest like in TestHarness.makeRequest() and use it to execute the required queries.
3. Compile the aggregate results as specified in the query.
4. Use some existing component (?) for returning the results to the user.
5. Put the components in steps 1-4 together into (?) so that it can be added to solrconfig.xml as a custom query parser accessible at http://solr/core/customparser

Is my approach reasonable, or am I overlooking some canonical way of achieving what I need to do? What and where do I need to look into to replace the question marks in my plan with knowledge? :) -- Juha
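The translation-layer suggestion above could be sketched as a function that turns the proprietary JSON into plain Solr request parameters. The JSON shape here (clauses/op/limit) is entirely hypothetical, since the thread doesn't describe the real format:

```python
import json

def to_solr_params(raw: str) -> dict:
    """Translate a proprietary JSON query (hypothetical shape) into Solr request params."""
    query = json.loads(raw)
    clauses = [f'{c["field"]}:{c["value"]}' for c in query["clauses"]]
    return {
        "q": f' {query.get("op", "AND")} '.join(clauses),  # join clauses with the boolean op
        "rows": query.get("limit", 10),
        "wt": "json",
    }
```

A symmetric function would then reshape Solr's JSON response back into the proprietary output format; neither requires subclassing QParser or SolrQueryRequestBase.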
RE: Question about SOLR search relevance score
Hi Sergio, Append 'debugQuery=on' to your queries to learn more about how your queries are being evaluated/ranked, i.e. qf=attributes_name^15+attributes_brand^10+attributes_category^8&debugQuery=on. You'll get an XML section that is dedicated to debug information. I've found http://explain.solr.pl/ useful in understanding and visualizing the debug output. Swati

-Original Message- From: sérgio Alves [mailto:sd_t_al...@hotmail.com] Sent: Wednesday, June 19, 2013 11:45 AM To: solr-user@lucene.apache.org Subject: Question about SOLR search relevance score

Hi. My name is Sérgio Alves and I'm a developer on a project that uses Solr as its search engine. Right now we're having problems with some common search terms. They return varied results, and the products which should appear first are scored lower than other, seemingly unrelated, products. I wanted to know if there is a parameter or any possible way for me to know how Solr calculates the scores it returns. For example, if we had a search relevancy formula like qf=attributes_name^15+attributes_brand^10+attributes_category^8, how can I know that brand scored 'x', name 'y', and category 'z'? Is that possible? How can I do that? This is urgent; if someone could take the time to answer this quickly, I would really appreciate it. Thank you very much for the attention, best regards, Sérgio Alves
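Building the debug request programmatically keeps the boosts readable; a sketch (host/core path is a placeholder, and defType=edismax is an assumption, since qf is a dismax-family parameter):

```python
from urllib.parse import urlencode

params = {
    "q": "some search terms",
    "defType": "edismax",  # assumed; qf is interpreted by the (e)dismax parsers
    "qf": "attributes_name^15 attributes_brand^10 attributes_category^8",
    "debugQuery": "on",    # adds the per-document score explanation to the response
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
```

The resulting response carries an explain section showing, per document, how much each qf field contributed to the final score.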
RE: Note on The Book
I'd definitely prefer the spiral bound as well. E-books are great and your draft version seems very reasonably priced (aka I would definitely get it). Really looking forward to this. Is there a separate mailing list / etc. for the book for those who would like to receive updates on the status of the book? Thanks Swati Swoboda Software Developer - Igloo Software +1.519.489.4120 sswob...@igloosoftware.com Bring back Cake Fridays – watch a video you’ll actually like http://vimeo.com/64886237 -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Thursday, May 23, 2013 7:15 PM To: solr-user@lucene.apache.org Subject: Note on The Book To those of you who may have heard about the Lucene/Solr book that I and two others are writing on Lucene and Solr, some bad and good news. The bad news: The book contract with O’Reilly has been canceled. The good news: I’m going to proceed with self-publishing (possibly on Lulu or even Amazon) a somewhat reduced scope Solr-only Reference Guide (with hints of Lucene). The scope of the previous effort was too great, even for O’Reilly – a book larger than 800 pages (or even 600) that was heavy on reference and lighter on “guide” just wasn’t fitting in with their traditional “guide” model. In truth, Solr is just too complex for a simple guide that covers it all, let alone Lucene as well. I’ll announce more details in the coming weeks, but I expect to publish an e-book-only version of the book, focused on Solr reference (and plenty of guide as well), possibly on Lulu, plus eventually publish 4-8 individual print volumes for people who really want the paper. One model I may pursue is to offer the current, incomplete, raw, rough, draft as a $7.99 e-book, with the promise of updates every two weeks or a month as new and revised content and new releases of Solr become available. Maybe the individual e-book volumes would be $2 or $3. These are just preliminary ideas. 
Feel free to let me know what seems reasonable or excessive. For paper: Do people really want perfect bound, or would you prefer spiral bound that lies flat and folds back easily? I suppose we could offer both – which should be considered “premium”? I’ll announce more details next week. The immediate goal will be to get the “raw rough draft” available to everyone ASAP. For those of you who have been early reviewers – your effort will not have been in vain. I have all your comments and will address them over the next month or two or three. Just for some clarity, the existing Solr Wiki and even the recent contribution of the LucidWorks Solr Reference to Apache really are still great contributions to general knowledge about Solr, but the book is intended to go much deeper into detail, especially with loads of examples and a lot more narrative guide. For example, the book has a complete list of the analyzer filters, each with a clean one-liner description. Ditto for every parameter (although I would note that the LucidWorks Solr Reference does a decent job of that as well.) Maybe, eventually, everything in the book COULD (and will) be integrated into the standard Solr doc, but until then, a single, integrated reference really is sorely needed. And, the book has a lot of narrative guide and walking through examples as well. Over time, I’m sure both will evolve. And just to be clear, the book is not a simple repurposing of the Solr wiki content – EVERY description of everything has been written fresh, from scratch. So, for example, analyzer filters get both short one-liner summary descriptions as well as more detailed descriptions, plus formal attribute specifications and numerous examples, including sample input and outputs (the LucidWorks Solr Reference does a better job with examples as well.) The book has been written in parallel with branch_4x and that will continue. -- Jack Krupansky
RE: Score after boost before
http://explain.solr.pl/ might help you out with parsing the response to see how boosts are affecting the scores. Take a look at some of the history/examples: http://explain.solr.pl/explains/7kjl0ids

-Original Message- From: abhayd [mailto:ajdabhol...@hotmail.com] Sent: Friday, April 05, 2013 12:37 PM To: solr-user@lucene.apache.org Subject: Re: Score after boost before

we do that now, but that's very time consuming. Also we want our QA to have that info available on the search result page in debug mode. -- View this message in context: http://lucene.472066.n3.nabble.com/Score-after-boost-before-tp4054052p4054102.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: How does solr 4.2 do in returning large datasets ?
It really depends on what you are returning (how big is each document? Just a document ID? Pages and pages of data in fields?). It can take a long time for Solr to render an XML response with 60,000 results. Solr will be serializing the data and then you'd (presumably) be de-serializing it. Depending on how big each field actually is, this could take a while or even cause a DoS on your server. Your client would also need a fair bit of memory to parse a document with 60,000 results.

-Original Message- From: Liz Sommers [mailto:lizswo...@gmail.com] Sent: Monday, April 01, 2013 9:39 AM To: solr-user Subject: How does solr 4.2 do in returning large datasets ?

I thought I remembered reading that Solr is not good at returning large datasets. We are currently using Lucene 3.6.0 and returning datasets of 10,000 to 60,000 results. In the future we might need to return even larger datasets. Would you all recommend going to Solr for this, or should we stick with Lucene (which has given us no problems in this regard)? I am a bit wary of using a web service to return datasets of this size. Thanks a lot Liz lizswo...@gmail.com
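One common mitigation, not mentioned above, is to page through the result set with Solr's start/rows parameters instead of asking for all 60,000 documents in one response. A sketch of the offset arithmetic (note that very large start values get progressively more expensive on the server):

```python
def pages(total: int, rows: int = 1000):
    """Yield (start, rows) pairs for paging through `total` matching documents."""
    for start in range(0, total, rows):
        yield start, min(rows, total - start)

# Each pair maps directly onto a request such as
# /select?q=...&start=<start>&rows=<rows>
```

The client then issues one request per page, keeping both the response size and the client-side parse memory bounded.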
RE: Is deltaQuery mandatory ?
No, it's not mandatory. You can't do delta imports without delta queries though; you'd need to do a full-import. Per your query, you'd only ever import rows with rownum <= 5000.

-Original Message- From: A. Lotfi [mailto:majidna...@yahoo.com] Sent: Thursday, March 28, 2013 10:07 AM To: gene...@lucene.apache.org; solr-user@lucene.apache.org Subject: Is deltaQuery mandatory ?

Is deltaQuery mandatory in data-config.xml ? I did it like this :

<entity name="residential"
        query="select * from tsunami.consumer_data_01 where state='MA' and rownum <= 5000"
        deltaQuery="select LEMSMATCHCODE, STREETNAME from residential where last_modified > '${dataimporter.last_index_time}'">

Then my manager came and said we don't need it, this is only for incremental imports. I took off the line that starts with deltaQuery; now in http://localhost:8983/solr/#/db/dataimport//dataimport the entity is empty, and when I click the Execute button, nothing happens. thanks.
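For reference, a delta-capable DIH entity normally pairs deltaQuery (which rows changed, identified by primary key) with deltaImportQuery (how to fetch one changed row). A sketch, assuming a primary-key column named ID — the thread doesn't say what the real key column is:

```xml
<entity name="residential"
        query="select * from tsunami.consumer_data_01 where state='MA' and rownum &lt;= 5000"
        deltaQuery="select ID from tsunami.consumer_data_01
                    where last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="select * from tsunami.consumer_data_01
                          where ID='${dih.delta.ID}'">
  <!-- field mappings go here -->
</entity>
```

With only the plain query attribute, full-import works and delta-import has nothing to run, which matches the empty-entity behavior described above.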
RE: SOLR - Unable to execute query error - DIH
What version of Solr 4 are you running? We are on 3.6.2 so I can't be confident whether these settings still exist (they probably do...), but here is what we do to speed up full indexing: In solrconfig.xml, increase ramBufferSizeMB to 128. Increase mergeFactor to 20. Make sure autoCommit is disabled. Basically, you want to minimize how often Lucene/Solr flushes (as that is very time consuming). Merging is also very time consuming, so you want large segments and fewer merges (hence the mergeFactor increase). We use these settings when we are doing our initial full indexing and then switch them back to saner defaults for our regular/delta indexing.

Roll-backs concern me; why did your import roll back? Did it give an error -- it should have, and it should be in your Solr log file. Was it because the connection timed out? It's important to find out. We prevented roll-backs by splitting our data across entities and then indexing one entity at a time. This ensured that if one entity failed, it didn't impact the entire process. (This can also be done by using autoCommit, but that slows down indexing.) If you're getting OOM errors, be sure that your Xmx value is set high enough (and that you have enough memory). You may be able to increase ramBufferSizeMB further depending on how much memory you have (we didn't have much). Hope this helps. Swati

-Original Message- From: kobe.free.wo...@gmail.com [mailto:kobe.free.wo...@gmail.com] Sent: Thursday, March 28, 2013 2:43 AM To: solr-user@lucene.apache.org Subject: RE: SOLR - Unable to execute query error - DIH

Thanks James. We have tried the following options *(individually)*, including the one you suggested: 1. selectMethod=cursor 2. batchSize=-1 3. responseBuffering=adaptive. But the indexing process doesn't seem to be improving at all. When we try to index a set of 500 rows it works well and gets completed in 18 min. For 1000K rows it took 22 hours (long) to index.
But when we try to index the complete set of 750K rows, it doesn't show any progress and keeps on executing. Currently both the SQL Server machine and the SOLR machine are running on 4 GB RAM. With this configuration, is the above scenario justified? If we think of upgrading the RAM, which machine should that be, the SOLR machine or the SQL Server machine? Are there any other efficient methods to import/index data from SQL Server to SOLR? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Unable-to-execute-query-error-DIH-tp4051028p4051981.html Sent from the Solr - User mailing list archive at Nabble.com.
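The solrconfig.xml settings recommended above might look like the following fragment. Element placement is version-dependent (3.x used <indexDefaults>/<mainIndex>; 4.x uses <indexConfig>), and the values are just the ones suggested in the reply:

```xml
<indexConfig>
  <ramBufferSizeMB>128</ramBufferSizeMB>  <!-- larger buffer = fewer flushes -->
  <mergeFactor>20</mergeFactor>           <!-- more segments before merging -->
</indexConfig>

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- leave autoCommit absent/commented out while bulk indexing,
       then restore a sane value for regular delta indexing -->
  <!--
  <autoCommit>
    <maxDocs>10000</maxDocs>
  </autoCommit>
  -->
</updateHandler>
```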
Contributors Group
Hello, Can I be added to the contributors group? Username sswoboda. Thank you. Swati
RE: Get page number of searchresult of a pdf in solr
You can get the paragraph of the search result via highlighting. You'd have to mark your field as stored (re-indexing required) and then specify it in the highlighting parameters: http://wiki.apache.org/solr/HighlightingParameters#hl As for getting the page number, I am not sure there is more you can do than what Michael suggested...

-Original Message- From: d...@geschan.de [mailto:d...@geschan.de] Sent: Thursday, February 28, 2013 3:27 PM To: solr-user@lucene.apache.org Subject: Get page number of searchresult of a pdf in solr

Hello, I'm building a web application where users can search for PDF documents and view them with pdf.js. I would like to display the search results with a short snippet of the paragraph where the search term was found and a link to open the document at the right page. So what I need is the page number and a short text snippet for every search result. I'm using Solr 4.1 for indexing PDF documents. The indexing itself works fine, but I don't know how to get the page number and paragraph of a search result. I only get the document in which the search term was found. -Gesh
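With hl=true&hl.fl=content set on the request, Solr returns a "highlighting" section keyed by unique document id. A sketch of pulling the first snippet per document out of the JSON response (the field name "content" is an assumption; use whatever stored field you highlight):

```python
def first_snippets(response: dict, field: str = "content") -> dict:
    """Map unique key -> first highlight fragment from a Solr JSON response."""
    return {
        doc_id: fragments[field][0]
        for doc_id, fragments in response.get("highlighting", {}).items()
        if fragments.get(field)  # skip docs with no fragment for this field
    }

# Illustrative response shape:
sample = {
    "highlighting": {
        "doc-1": {"content": ["... a <em>matching</em> paragraph ..."]},
        "doc-2": {},  # no match in the highlighted field
    }
}
```

The <em> markers show where the query terms matched, which gives the snippet; mapping a snippet back to a PDF page number is a separate problem, as noted above.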
RE: POI error while extracting docx document
Hey Carlos, What version of Solr are you running and what version of openxml4j did you import? Swati

-Original Message- From: Carlos Alexandro Becker [mailto:caarl...@gmail.com] Sent: Tuesday, February 26, 2013 12:04 PM To: solr-user Subject: Re: POI error while extracting docx document

I've added the openxml4j jar to the project, still doesn't work. Which is the correct version?

On Tue, Feb 26, 2013 at 11:23 AM, Carlos Alexandro Becker caarl...@gmail.com wrote:

I made Solr extract the files' content. That's OK, but some files (like .docx files) give me errors, while .pdf files index as expected. The error is:

14:20:29,714 ERROR [org.apache.solr.servlet.SolrDispatchFilter] (http--0.0.0.0-8080-4) null:java.lang.RuntimeException: java.lang.NoSuchMethodError: org.apache.poi.openxml4j.opc.PackagePart.getRelatedPart(Lorg/apache/poi/openxml4j/opc/PackageRelationship;)Lorg/apache/poi/openxml4j/opc/PackagePart;
at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:469)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:297)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.authenticator.SingleSignOn.invoke(SingleSignOn.java:397)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:877)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:671)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NoSuchMethodError: org.apache.poi.openxml4j.opc.PackagePart.getRelatedPart(Lorg/apache/poi/openxml4j/opc/PackageRelationship;)Lorg/apache/poi/openxml4j/opc/PackagePart;
at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedParts(AbstractOOXMLExtractor.java:121)
at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:107)
at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:112)
at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
... 14 more

Looks like a dependency issue. What dependency should I add to fix this? -- Atenciosamente, *Carlos Alexandro Becker* http://caarlos0.github.com/about

-- Atenciosamente, *Carlos Alexandro Becker* http://caarlos0.github.com/about
RE: User Query Processing Sanity Check
Maybe I am not understanding correctly, but have you overlooked the qf parameter of edismax? http://wiki.apache.org/solr/ExtendedDisMax#qf_.28Query_Fields.29 Suppose you want to search for the phrase apples and bananas in title, content, and description. You also want it to have greater weight when the search term is found in title and description. The way you would do it is: q = apples and bananas, qf = title^100 content description^10. That's it. Now it'll search for apples and bananas in all 3 fields. Edismax was basically designed to do... what you want to do. You'll probably also find the mm and pf parameters immensely useful.

-Original Message- From: z...@navigo.com [mailto:z...@navigo.com] Sent: Monday, February 25, 2013 12:06 PM To: solr-user@lucene.apache.org Subject: User Query Processing Sanity Check

We have been working with Solr for about 6 months -- straightforward stuff, basic keyword searches. We want to move to more advanced stuff, to support 'must include', 'must not include', set union, etc. I.e., more advanced query strings. We seem to have hit a block, and are considering two paths and want to make sure we have the right understanding before wasting time. To wit: - We have many fields to search: fieldA, fieldB, fieldC, etc. - We need field-level boosting: fieldA over fieldB over fieldC, etc. - We're happy to use the EDisMax query syntax: quotes, +, -, OR, AND, (), and field:term superficial syntax. Passing the query straight through doesn't seem to work, because foo bar fieldB:baz searches foo and bar in the default field only, but we want to search multiple fields. The trick of copying multiple fields into a single artificial default field seems to fail on the second requirement.
So we end up parsing the Lucene syntax ourselves and rebuilding the query by multiplying the fields, so that: foo bar fieldB:baz -> (fieldA:foo OR fieldB:foo OR fieldC:foo) AND (fieldA:bar OR fieldB:bar OR fieldC:bar) AND (fieldB:baz). Technically, this is straightforward enough, but it seems a shame, since the EDisMax query parser seems like it's *almost* what we want, if it weren't for the reality of the singular default field. Are we correct to build our own mini-parser that takes query strings and multiplies the fields for free-field sub-predicates? Or is there a simpler path that we're overlooking? Regards, Zane -- View this message in context: http://lucene.472066.n3.nabble.com/User-Query-Processing-Sanity-Check-tp4042783.html Sent from the Solr - User mailing list archive at Nabble.com.
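For concreteness, the manual field-multiplication Zane describes could be sketched as below. This naive version assumes the query is already split into whitespace-separated terms (real Lucene syntax, with quoted phrases and nested parentheses, needs a proper parser, which is exactly the work edismax's qf parameter avoids):

```python
def multiply_fields(terms, fields):
    """Expand each bare term across all fields; keep field-qualified terms as-is."""
    clauses = []
    for t in terms:
        if ":" in t:  # already field-qualified, e.g. fieldB:baz
            clauses.append(f"({t})")
        else:
            clauses.append("(" + " OR ".join(f"{f}:{t}" for f in fields) + ")")
    return " AND ".join(clauses)
```

Running it on the thread's example reproduces the rewritten query, which is also what qf=fieldA fieldB fieldC would effectively achieve without any custom parsing.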
RE: Index data from multiple tables into Solr
He is talking about this list, the list we are using to communicate. You are sending your messages to a mailing list -- thousands are on it. An example of a program that will run the delta-import/full-import commands: cron. You are basically calling a URL with specific parameters to pull data from your DB. Examples of programs that will use the Solr API are all application specific (based on what fields are in your schema, etc.). Swati

-Original Message- From: hassancrowdc [mailto:hassancrowdc...@gmail.com] Sent: Tuesday, January 15, 2013 2:00 PM To: solr-user@lucene.apache.org Subject: Re: Index data from multiple tables into Solr

Which list are you referring to? And can you please give an example of such a program (it doesn't matter if it is for your setup)?

On Tue, Jan 15, 2013 at 12:06 PM, Shawn Heisey-4 [via Lucene] ml-node+s472066n4033518...@n3.nabble.com wrote:

On 1/15/2013 9:20 AM, hassancrowdc wrote: Hi, once I have indexed data from multiple tables from a MySQL database into Solr, is there any way that it updates the data (automatically) if any change is made to the data in MySQL?

You need to write a program to do this. Although this list can provide guidance, such programs are highly customized to the particulars of your setup. There is not really any general-purpose solution here. There are two typical approaches - have a program that initiates delta-imports with the dataimporter, or write a program that both talks to your database and uses a Solr client API to send updates to Solr. I used to use the former approach; now I use the latter. I still use the dataimporter for full reindexes, though.
Thanks, Shawn -- View this message in context: http://lucene.472066.n3.nabble.com/Index-data-from-multiple-tables-into-Solr-tp4032266p4033545.html Sent from the Solr - User mailing list archive at Nabble.com.
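The cron approach mentioned above might look like the following crontab entry. The core name, host, and interval are placeholders; command=delta-import is the DIH command being invoked, and clean=false keeps existing documents:

```cron
# Run a DataImportHandler delta-import every 15 minutes
*/15 * * * * curl -s "http://localhost:8983/solr/mycore/dataimport?command=delta-import&clean=false" >/dev/null
```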
RE: Index data from multiple tables into Solr
https://wiki.apache.org/solr/Solrj client. You'd have to configure it / use it based on your application needs.

-Original Message- From: hassancrowdc [mailto:hassancrowdc...@gmail.com] Sent: Tuesday, January 15, 2013 2:38 PM To: solr-user@lucene.apache.org Subject: Re: Index data from multiple tables into Solr

ok. so if i have manufacturer and id fields in the schema file, what will the program that will use the Solr API look like? -- View this message in context: http://lucene.472066.n3.nabble.com/Index-data-from-multiple-tables-into-Solr-tp4032266p4033556.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Index data from multiple tables into Solr
What error are you getting? Which field are you searching (default field)? Did you try specifying a default field? What is your schema like? Which analyzers did you use? Which version of Solr are you using? I highly recommend going through the tutorial to get a basic understanding of inserting, updating, and searching: http://lucene.apache.org/solr/tutorial.html Hours have been spent in setting up these tutorials and they are very informative.

-Original Message- From: hassancrowdc [mailto:hassancrowdc...@gmail.com] Sent: Tuesday, January 15, 2013 3:38 PM To: solr-user@lucene.apache.org Subject: Re: Index data from multiple tables into Solr

okay, thank you. After indexing data from the database into Solr, I want to search such that if I write any word (that is included in the documents that have been indexed) it should return all the documents that include that word. But it does not. When I write http://localhost:8983/solr/select?q=anyword it gives me an error. Is there anything wrong with my http? Or is this the wrong place to search?

On Tue, Jan 15, 2013 at 2:48 PM, sswoboda [via Lucene] ml-node+s472066n4033563...@n3.nabble.com wrote:

https://wiki.apache.org/solr/Solrj client. You'd have to configure it / use it based on your application needs.

-Original Message- From: hassancrowdc Sent: Tuesday, January 15, 2013 2:38 PM To: solr-user@lucene.apache.org Subject: Re: Index data from multiple tables into Solr

ok. so if i have manufacturer and id fields in the schema file, what will the program that will use the Solr API look like? -- View this message in context: http://lucene.472066.n3.nabble.com/Index-data-from-multiple-tables-into-Solr-tp4032266p4033556.html Sent from the Solr - User mailing list archive at Nabble.com.
-- View this message in context: http://lucene.472066.n3.nabble.com/Index-data-from-multiple-tables-into-Solr-tp4032266p4033614.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Index data from multiple tables into Solr
http://wiki.apache.org/solr/ExtendedDisMax Specify your query fields in the qf parameter. Take a look at the example at the bottom of the page.

-Original Message- From: hassancrowdc [mailto:hassancrowdc...@gmail.com] Sent: Tuesday, January 15, 2013 3:56 PM To: solr-user@lucene.apache.org Subject: Re: Index data from multiple tables into Solr

I don't want to search by one field; I want to search as a whole. I am following that tutorial: I got indexing and updating working, but now for search I would like to search through everything I have indexed, not a specific field. I can do it by using the default field, but I would like to search through everything I have indexed. Any hint how I can do that?

On Tue, Jan 15, 2013 at 3:49 PM, sswoboda [via Lucene] ml-node+s472066n4033617...@n3.nabble.com wrote:

What error are you getting? Which field are you searching (default field)? Did you try specifying a default field? What is your schema like? Which analyzers did you use? Which version of Solr are you using? I highly recommend going through the tutorial to get a basic understanding of inserting, updating, and searching: http://lucene.apache.org/solr/tutorial.html Hours have been spent in setting up these tutorials and they are very informative.

-Original Message- From: hassancrowdc Sent: Tuesday, January 15, 2013 3:38 PM To: solr-user@lucene.apache.org Subject: Re: Index data from multiple tables into Solr

okay, thank you. After indexing data from the database into Solr, I want to search such that if I write any word (that is included in the documents that have been indexed) it should return all the documents that include that word. But it does not. When I write http://localhost:8983/solr/select?q=anyword it gives me an error. Is there anything wrong with my http? Or is this the wrong place to search?
On Tue, Jan 15, 2013 at 2:48 PM, sswoboda [via Lucene] [hidden email] wrote: https://wiki.apache.org/solr/Solrj client. You'd have to configure it / use it based on your application needs. -Original Message- From: hassancrowdc [mailto:[hidden email]] Sent: Tuesday, January 15, 2013 2:38 PM To: [hidden email] Subject: Re: Index data from multiple tables into Solr OK. So if I have manufacturer and id fields in the schema file, what will the program that uses the Solr API look like?
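To make the eDisMax suggestion concrete, here is a minimal sketch; the handler name and the field names (name, manufacturer, description) are assumptions for illustration, not taken from the thread:

```xml
<!-- solrconfig.xml (sketch): search several fields at once via edismax.
     Field names below are placeholders. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">name manufacturer description</str>
  </lst>
</requestHandler>
<!-- Equivalent per-request form:
     http://localhost:8983/solr/select?defType=edismax&q=anyword&qf=name+manufacturer+description -->
```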
RE: Reading properties in data-import.xml
I am on 3.6 and this is my setup: Properties file under solr.home, so right under /jetty/solr solr.xml modified as follows: core name=corename instanceDir=instancedir properties=../solrcore.properties / http://wiki.apache.org/solr/CoreAdmin#property - the path is relative to instancedir Your syntax is correct in DIH, I think all you are missing is the reference to the property file in solr.xml. -Original Message- From: Dariusz Borowski [mailto:darius...@gmail.com] Sent: Thursday, January 10, 2013 10:38 AM To: solr-user@lucene.apache.org Subject: Re: Reading properties in data-import.xml Thanks Alexandre! I followed your example and created a solrcore.properties in solr.home/conf/solrcore.properties. I created a symlink in my core/conf to the solrcore.properties file, but I can't read the properties. My properties file: username=myusername password=mypassword My data-import.xml: dataSource type=JdbcDataSource driver=com.mysql.jdbc.Driver url=jdbc:mysql://${host}:3306/projectX user=${username} password=${password} / Is the syntax correct? Best regards, Dariusz On Thu, Jan 10, 2013 at 3:21 PM, Alexandre Rafalovitch arafa...@gmail.comwrote: dataimport.properties is for DIH to store it's own properties for delta processing and things. Try solrcore.properties instead, as per recent discussion: http://lucene.472066.n3.nabble.com/Reading-database-connection-propert ies-from-external-file-td4031154.html Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Thu, Jan 10, 2013 at 3:58 AM, Dariusz Borowski darius...@gmail.com wrote: I'm having a problem using a property file in my data-import.xml file. My aim is to not hard code some values inside my xml file, but rather reusing the values from a property file. 
I'm using multicore and some of the values are being changed from time to time and I do not want to change them in all my data-import files. For example: dataSource type=JdbcDataSource driver=com.mysql.jdbc.Driver url=jdbc:mysql://${host}:3306/projectX user=${username} password=${password} / I tried everything, but don't know how I can use proporties here. I tried to put my values in dataimport.properties, located under SOLR-HOME/conf and under SOLR-HOME/core1/conf, but without any success. Please, could someone help me on this?
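Putting the pieces from this thread together, a minimal sketch might look like the following; the core name and paths are placeholders, and it assumes the core element references the properties file as described above:

```xml
<!-- solr.xml (sketch): the properties path is relative to instanceDir -->
<core name="core1" instanceDir="core1" properties="../solrcore.properties"/>

<!-- solrcore.properties (plain Java properties format):
       host=localhost
       username=myusername
       password=mypassword
-->

<!-- data-import.xml: ${...} placeholders are resolved from the core properties -->
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://${host}:3306/projectX"
            user="${username}" password="${password}"/>
```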
RE: Need help with delta import
If I am not mistaken, it's supposed to be dataimporter.delta.ID and dataimporter.last_index_time. You are using dataimport.delta.ID and dataimport.last_index_time. http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport -Original Message- From: umajava [mailto:umaj...@gmail.com] Sent: Thursday, December 13, 2012 9:35 PM To: solr-user@lucene.apache.org Subject: RE: Need help with delta import Thanks a lot for your reply. I have made the changes but it still does not work. I still get the same results. Any other suggestions please?
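For reference, a delta-import entity using the dataimporter.* spellings might look like this sketch; the table and column names are placeholders:

```xml
<entity name="item" pk="ID"
        query="SELECT * FROM item"
        deltaQuery="SELECT ID FROM item
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT * FROM item
                          WHERE ID = '${dataimporter.delta.ID}'"/>
```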
RE: Need help with delta import
I am also confused, as I've been using dataimporter.* and not dih.* and it is working fine. -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Friday, December 14, 2012 2:41 PM To: solr-user@lucene.apache.org Subject: Re: Need help with delta import On 12/14/2012 11:39 AM, Dyer, James wrote: Try ${dih.delta.ID} instead of ${dataimporter.delta.id}. Also use ${dih.last_index_time} instead of ${dataimporter.last_index_time} . I noticed when updating the test cases that the wiki incorrectly used the longer name but with all the versions I tested this on only the short name works. The wiki has since been changed. James, I use DIH for full Solr reindexes. My dih config makes extensive use of ${dataimporter.request.XXX} variables for my own custom parameters. I am using branch_4x checked out yesterday on my dev machine, and I did a full reindex on that version, which worked. Three questions: 1) Should I be using ${dih.request.XXX} instead? 2) Is the longer syntax going away? 3) What issues and/or docs would be good reading material? Thanks, Shawn
RE: Can a field with defined synonym be searched without the synonym?
Query-time analyzers are still applied, even if you include a string in quotes. Would you expect foo to not match Foo just because it's enclosed in quotes? Also look at this thread from someone who had similar requirements: http://lucene.472066.n3.nabble.com/Synonym-Filter-disable-at-query-time-td2919876.html -Original Message- From: joe.cohe...@gmail.com [mailto:joe.cohe...@gmail.com] Sent: Wednesday, December 12, 2012 12:09 PM To: solr-user@lucene.apache.org Subject: Re: Can a field with defined synonym be searched without the synonym? I'm applying only query-time synonyms, so I have the original values stored and indexed. I would've expected that if I search a string with quotation marks, I'll get the exact match, without applying a synonym. Any way to achieve that? Upayavira wrote: You can only search against terms that are stored in your index. If you have applied index-time synonyms, you can't remove them at query time. You can, however, use copyField to clone an incoming field to another field that doesn't use synonyms, and search against that field instead. Upayavira On Wed, Dec 12, 2012, at 04:26 PM, joe.cohen.m@ wrote: Hi, I have a field type with a synonyms.txt defined, which retrieves records with both home and house when I search for either one of them. I want to be able to search this field on the specific value that I enter, without the synonym filter. Is that possible? Thanks.
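Upayavira's copyField suggestion can be sketched in schema.xml as follows; the field and type names are placeholders, and text_no_synonyms is assumed to be a field type whose query analyzer simply omits SynonymFilterFactory:

```xml
<!-- Clone the field into a type without a query-time SynonymFilterFactory -->
<field name="title"       type="text_with_synonyms" indexed="true" stored="true"/>
<field name="title_exact" type="text_no_synonyms"   indexed="true" stored="false"/>
<copyField source="title" dest="title_exact"/>
<!-- Exact(ish) searches then target the clone: title_exact:"home" -->
```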
RE: Boost docs which are posted recently
Hi Sangeetha, If you need to boost based on date regardless of type, just use date boosting with a higher boost: http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents http://wiki.apache.org/solr/FunctionQuery#Date_Boosting -Original Message- From: Sangeetha [mailto:sangeetha...@gmail.com] Sent: Tuesday, December 11, 2012 4:29 AM To: solr-user@lucene.apache.org Subject: Boost docs which are posted recently Hi, I have a doc with the field type_s. The value can be news, photos, or videos. Priority is given in this order: photos, videos, then news, using the query q=sachin&defType=dismax&bq=type_s:photos^10&bq=type_s:videos^7&bq=type_s:news^5 Even though it gives more priority to photos, sometimes it needs to display the videos/news if they were posted recently. How can I achieve this? Is it possible to use a single bq for multiple fields using space or +? Thanks, Sangeetha
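Combining the type boosts with the recip/ms date boost from the FunctionQuery wiki page might look like the sketch below; posted_date_dt is a placeholder for whatever date field the documents actually have:

```
q=sachin&defType=dismax
  &bq=type_s:photos^10&bq=type_s:videos^7&bq=type_s:news^5
  &bf=recip(ms(NOW,posted_date_dt),3.16e-11,1,1)
```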
RE: Searching for phrase
It's because you are escaping. Look at this bit: [parsedquery_toString] = +(smsc_content:abcdefg12345 smsc_content:678910 smsc_description:abcdefg12345 smsc_content:678910) +smsc_lastdate:[1352627991000 TO 1386755331000] It's searching for the quote characters as well because you escaped them (hence they are not considered 'special' anymore, which, in this case, you do not want). Just search for "abcdefg12345 678910" without escaping the quotes. Your parsed query (in debug mode) should then look something like this: <str name="parsedquery">PhraseQuery(smsc_content:"abcdefg12345 678910")</str> <str name="parsedquery_toString">smsc_content:"abcdefg12345 678910"</str> Swati -Original Message- From: Arkadi Colson [mailto:ark...@smartbit.be] Sent: Tuesday, December 11, 2012 10:36 AM To: solr solr-user@lucene.apache.org Subject: Searching for phrase Hi, My schema looks like this:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt,stopwords_du.txt" enablePositionIncrements="true"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt,stopwords_du.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

I inserted these 2 strings into Solr: "abcdefg12345 678910" and "abcdefg12345 xyz 678910". When searching for "abcdefg12345 678910" with quotes I got no result. Without quotes both strings are found.
SolrObject Object ( [responseHeader] = SolrObject Object ( [status] = 0 [QTime] = 38 [params] = SolrObject Object ( [sort] = score desc [indent] = on [collection] = intradesk [wt] = xml [version] = 2.2 [rows] = 5 [debugQuery] = true [fl] = id,smsc_module,smsc_modulekey,smsc_userid,smsc_ssid,smsc_description,smsc_content,smsc_courseid,smsc_lastdate,score,metadata_stream_size,metadata_stream_source_info,metadata_stream_name,metadata_stream_content_type,last_modified,author,title,subject [start] = 0 [q] = (smsc_content:\abcdefg12345 678910\ || smsc_description:\abcdefg12345 678910\) (smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) (smsc_ssid:929) ) ) [response] = SolrObject Object ( [numFound] = 0 [start] = 0 [docs] = ) [debug] = SolrObject Object ( [rawquerystring] = (smsc_content:\abcdefg12345 678910\ || smsc_description:\abcdefg12345 678910\) (smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) (smsc_ssid:929) [querystring] = (smsc_content:\abcdefg12345 678910\ || smsc_description:\abcdefg12345 678910\) (smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) (smsc_ssid:929) [parsedquery] = +(smsc_content:abcdefg12345 smsc_content:678910 smsc_description:abcdefg12345 smsc_content:678910) +smsc_lastdate:[1352627991000 TO 1386755331000] +smsc_ssid:929 [parsedquery_toString] = +(smsc_content:abcdefg12345 smsc_content:678910 smsc_description:abcdefg12345 smsc_content:678910) +smsc_lastdate:[1352627991000 TO 1386755331000] +smsc_ssid:`#8;#0;#0;#7;! [QParser] = LuceneQParser [explain] = SolrObject Object ( ) ) ) Anybody an idea what's wrong? -- Met vriendelijke groeten Arkadi Colson Smartbit bvba . Hoogstraat 13 . 3670 Meeuwen T +32 11 64 08 80 . F +32 11 64 08 81
RE: highlighting multiple occurrences
Did you mean that you want multiple snippets? http://wiki.apache.org/solr/HighlightingParameters#hl.snippets -Original Message- From: Rafael Ribeiro [mailto:rafae...@gmail.com] Sent: Monday, December 10, 2012 11:20 AM To: solr-user@lucene.apache.org Subject: highlighting multiple occurrences Hi all, I have a solr instance with one field configured for highlighting as follows: str name=hlon/str str name=hl.flconteudo/str str name=hl.fragsize500/str str name=hl.maxAnalyzedChars9/str str name=hl.simple.prelt;font style=background-color: yellowgt;/str but I was willing to have the highlighter display multiple occurrences of the query instead of the first one... is it possible? I tried searching this mailing list but I couldn't find anyone mentioning this... best regards, Rafael -- View this message in context: http://lucene.472066.n3.nabble.com/highlighting-multiple-occurrences-tp4025715.html Sent from the Solr - User mailing list archive at Nabble.com.
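If multiple snippets are indeed what's wanted, the relevant parameters might look like this sketch (hl.snippets defaults to 1, which would explain only ever seeing one fragment):

```
hl=true&hl.fl=conteudo&hl.snippets=5&hl.fragsize=500
```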
RE: Is there a way to round data when index, but still able to return original content?
When you apply your analyzers/filters/tokenizers, the resulting value is kept in the index; however, the input value is what is actually stored. For example, from a schema.xml file:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

This particular field type will strip out the HTML. So if the input is <b>Hello</b>, it's tokenized in the index as hello, but it's stored (and hence returned to you) as <b>Hello</b>. So you can create your own charFilter or filter class which converts your date for the indexer, while the original data will automatically be stored. I hope this makes sense. -Original Message- From: jefferyyuan [mailto:yuanyun...@gmail.com] Sent: Monday, December 10, 2012 10:24 AM To: solr-user@lucene.apache.org Subject: Re: Is there a way to round data when index, but still able to return original content? Erick, Thanks for your reply. I know how to implement solution 1, but I have no idea how to implement the solution 2 you mentioned: === If you put some sort of (perhaps custom) filter in place, then the original value would go in as stored and the altered value would get in the index and you could do both in the same field. === Can you please describe more about how to store original data and index the altered value in the same field? Thanks :)
RE: highlighting multiple occurrences
Rafael, Can you share more on how you are rendering the results in your Velocity template? The data is probably being sent to you, but you have to loop through and actually access it. -Original Message- From: Rafael Ribeiro [mailto:rafae...@gmail.com] Sent: Monday, December 10, 2012 2:26 PM To: solr-user@lucene.apache.org Subject: RE: highlighting multiple occurrences Yep! I tried enabling this and setting various values, but no success... it still only shows the first fragment found for the search. I also saw this http://lucene.472066.n3.nabble.com/hl-snippets-in-solr-3-1-td2445178.html but increasing maxAnalyzedChars (which was already huge) produced no difference at all. Do I have to change anything else? For example, something in the Velocity template? best regards, Rafael
RE: Is there a way to round data when index, but still able to return original content?
Hi, Nope... they don't. Generally, I am not sure I'd bother rounding this information to reduce the index size. Have you determined how much index space you'd actually be saving? I am not confident it'd be worth your time; i.e., I'd just go with indexing/storing the time information as well. Regardless, if you do want to go this route, the only way I can think of that wouldn't be a complicated solution is to have one field that is indexed/rounded (and not stored) and another field that is just stored (and not indexed). Hope this helps. -Original Message- From: jefferyyuan [mailto:yuanyun...@gmail.com] Sent: Monday, December 10, 2012 3:14 PM To: solr-user@lucene.apache.org Subject: RE: Is there a way to round data when index, but still able to return original content? Sorry to ask a question again, but I want to round TrieDateField and TrieLongField values; it seems those types don't support configuring an analyzer (charFilter, tokenizer, or filter). What should I do? Right now I am thinking of writing my own custom date or long field type; is there any other way? :) Thanks :)
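The two-field approach described above could be sketched in schema.xml like this; field names are placeholders, and it assumes the client (or an update processor) supplies the rounded value at index time, e.g. by sending a day-truncated timestamp:

```xml
<!-- Rounded copy for searching (indexed only); original for display (stored only) -->
<field name="created_rounded" type="tdate" indexed="true"  stored="false"/>
<field name="created"         type="tdate" indexed="false" stored="true"/>
```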
RE: Selective field level security
Hi Nalini, We had similar requirements and this is how we did it (using your example): Record A: Field1_All: something Field1_Private: something Field2_All: '' Field2_Private: something private Field3_All: '' Field3_Private: something very private Fields_All: something Fields_Private: something something private something very private Basically, we're just using a lot of copy fields and dynamic fields. Instead of storing a type, we just change the column name. So for someone who has access to the private fields, we perform our search against the private fields: (fields_private:something) Or if you want a specific field: (field1_private:something) OR (field2_private:something) OR (field3_private:something) Likewise, if someone didn't have access to the private fields, we would only search in the "all" fields. We also created a super field so that we don't have to search each individual field -- we use copyFields to copy all private fields into the super field and just search that. I hope this helps. Swati -Original Message- From: Nalini Kartha [mailto:nalinikar...@gmail.com] Sent: Monday, September 17, 2012 2:45 PM To: solr-user@lucene.apache.org Subject: Selective field level security Hi, We're trying to push some security-related info into the index which will control which users can search certain fields, and we're wondering what the best way to accomplish this is. Some records that are being indexed and searched can have certain fields marked as private. When a field is marked as private, some querying users should not see/search on it whereas some super users can. Here are the solutions we're considering - - Index a separate boolean value into a new _INTERNAL field to indicate if the corresponding field value is marked private or not, and include a filter in the query when the searching user is not a super user. So for e.g., consider that a record can contain 3 fields - field[123] - where field1 and field2 can be marked as private but field3 cannot.
Record A has only field1 marked as private, record B has both field1 and field2 marked as private. When we index these records here's what we'd end up with in the index - Record A - field1:something, field1_INTERNAL:1, field2:something, field2_INTERNAL:0, field3:something Record B - field1:something, field1_INTERNAL:1, field2:something, field2_INTERNAL:1, field3:something If the searching user is NOT a super user then the query (let's say it's 'hidden security') needs to look like this- ((field3:hidden) OR (field1:hidden AND field1_INTERNAL:0) OR (field2:hidden AND field2_INTERNAL:0)) AND ((field3:security) OR (field1:security AND field1_INTERNAL:0) OR (field2:security AND field2_INTERNAL:0)) Manipulating the query this way seems painful and error prone so we're wondering if Solr provides anything out of the box that would help with this? - Index the private values themselves into a separate _INTERNAL field and then determine which fields to query depending on the visibility of the searching user. So using the example from above, here's what the indexed records would look like - Record A - field1_INTERNAL:something, field2:something, field3:something Record B - field1_INTERNAL:something, field2_INTERNAL:something, field3:something If the searching user is NOT a super user then the query just needs to be against the regular fields whereas if the searching user IS a super user, the query needs to be against BOTH the regular and INTERNAL fields. The issue with this solution is that since the number of docs that include the INTERNAL fields is going to be much fewer we're wondering if relevancy would be messed up when we're querying both regular and internal fields for super users? Thoughts? Thanks, Nalini
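The *_All / *_Private naming scheme described earlier in this thread might be wired up with dynamic fields and copyFields roughly like this; the names and the text type are placeholders:

```xml
<dynamicField name="*_all"     type="text" indexed="true" stored="true"/>
<dynamicField name="*_private" type="text" indexed="true" stored="true"/>

<!-- "Super" fields so a single clause can cover every column -->
<field name="fields_all"     type="text" indexed="true" stored="false" multiValued="true"/>
<field name="fields_private" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="*_all"     dest="fields_all"/>
<copyField source="*_private" dest="fields_private"/>
```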
RE: Stats field with decimal values
You can use an XSL response writer to transform your values to have a different precision. http://wiki.apache.org/solr/XsltResponseWriter Would most likely be better for your client to just do it on his end though. He is probably parsing the response anyway. -Original Message- From: Gustav [mailto:xbihy...@sharklasers.com] Sent: Monday, September 17, 2012 1:10 PM To: solr-user@lucene.apache.org Subject: Re: Stats field with decimal values Well, my client is asking if is it possible, im just providing the search enginne to him, not working directly with the application. Dont know exactly in what language he is programming. -- View this message in context: http://lucene.472066.n3.nabble.com/Stats-field-with-decimal-values-tp4008292p4008395.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Solr Index problem
Are you committing? You have to commit for them to be actually added -Original Message- From: ranmatrix S [mailto:ranmat...@gmail.com] Sent: Thursday, August 23, 2012 5:46 PM To: solr-user@lucene.apache.org Subject: Solr Index problem Hi, I have setup Solr to index data from Oracle DB through DIH handler. However through Solr admin I could see the DB connection is successfull, data retrieved from DB to Solr but not added into index. The message is that 0 documents added even when I am able to see that 9 records are returned back. The schema and fields in db-data-config.xml are one and the same. Please suggest if anything I should look for. -- Regards, Ran...
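For a DIH-based setup, that usually means making sure a commit happens at the end of the import run. A sketch of the relevant requests (host and paths are placeholders):

```
http://localhost:8983/solr/dataimport?command=full-import&commit=true
http://localhost:8983/solr/update?commit=true   (explicit commit, if needed)
```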
RE: DataImportHandler WARNING: Unable to resolve variable
I am getting a similar issue while using a TemplateTransformer. My fields *always* have a value as well - it is getting indexed correctly. Furthermore, the number of warnings I get seems arbitrary. I imported one document (debug mode) and I got roughly ~400 of those warning messages for the single field. -Original Message- From: Jon Drukman [mailto:jdruk...@gmail.com] Sent: Thursday, August 09, 2012 2:38 PM To: solr-user@lucene.apache.org Subject: DataImportHandler WARNING: Unable to resolve variable I'm trying to use DataImportHandler's delta-import functionality but I'm getting loads of these every time it runs: WARNING: Unable to resolve variable: article.url_type while parsing expression: article:${article.url_type}:${article.id} The definition looks like:

<entity name="article" query="... irrelevant ..."
        deltaQuery="select id,'dummy' as type_id FROM articles WHERE (post_date &gt; '${dataimporter.last_index_time}' OR updated_date &gt; '${dataimporter.last_index_time}') AND post_date &lt;= NOW() AND status = 9"
        deltaImportQuery="select id, article_seo_title, DATE_FORMAT(post_date,'%Y-%m-%dT%H:%i:%sZ') post_date, subject, body, IF(url_type='', 'article', url_type) url_type, featured_image_url from articles WHERE id = ${dataimporter.delta.id}"
        transformer="TemplateTransformer,HTMLStripTransformer">
  <field column="id" name="id" />
  <field column="post_date" name="post_date" />
  <field column="subject" name="title" />
  <field column="body" name="subhead" stripHTML="true" />
  <field column="type_id" template="article:${article.url_type}:${article.id}" />
  <field column="type" template="2" />
  <field column="featured_image_url" name="main_image" />
  <field column="article_seo_title" name="seo_title" />
</entity>

As you can see, I am always making sure that article.url_type has some value. Why am I getting the warning? -jsd-
RE: DataImportHandler WARNING: Unable to resolve variable
Ah, my bad. I was incorrect - it was not actually indexing. @Jon - is there a possibility that your url_type is NULL, but not empty? Your if check only checks to see if it is empty, which is not the same as checking to see if it is null. If it is null, that's why you'd be having those errors - null values are just not accepted, it seems. Swati -Original Message- From: Swati Swoboda [mailto:sswob...@igloosoftware.com] Sent: Thursday, August 09, 2012 11:09 PM To: solr-user@lucene.apache.org Subject: RE: DataImportHandler WARNING: Unable to resolve variable I am getting a similar issue when while using a Template Transformer. My fields *always* have a value as well - it is getting indexed correctly. Furthermore, the number of warnings I get seems arbitrary. I imported one document (debug mode) and I got roughly ~400 of those warning messages for the single field. -Original Message- From: Jon Drukman [mailto:jdruk...@gmail.com] Sent: Thursday, August 09, 2012 2:38 PM To: solr-user@lucene.apache.org Subject: DataImportHandler WARNING: Unable to resolve variable I'm trying to use DataImportHandler's delta-import functionality but I'm getting loads of these every time it runs: WARNING: Unable to resolve variable: article.url_type while parsing expression: article:${article.url_type}:${article.id} The definition looks like: entity name=article query=... irrelevant ... 
deltaQuery=select id,'dummy' as type_id FROM articles WHERE (post_date gt; '${dataimporter.last_index_time}' OR updated_date gt; '${dataimporter.last_index_time}') AND post_date lt;= NOW() AND status = 9 deltaImportQuery=select id, article_seo_title, DATE_FORMAT(post_date,'%Y-%m-%dT%H:%i:%sZ') post_date, subject, body, IF(url_type='', 'article', url_type) url_type, featured_image_url from articles WHERE id = ${dataimporter.delta.id} transformer=TemplateTransformer,HTMLStripTransformer field column=id name=id / field column=post_date name=post_date / field column=subject name=title / field column=body name=subhead stripHTML=true / field column=type_id template=article:${article.url_type}:${ article.id} / field column=type template=2 / field column=featured_image_url name=main_image / field column=article_seo_title name=seo_title / /entity As you can see, I am always making sure that article.url_type has some value. Why am I getting the warning? -jsd-
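As a follow-up to the NULL-vs-empty point above: the IF() in the deltaImportQuery only catches empty strings. A null-safe MySQL variant might look like this (a sketch, assuming the same articles table):

```sql
-- Treat both NULL and '' as the default 'article' type
IF(url_type IS NULL OR url_type = '', 'article', url_type) AS url_type
```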