Text formatting lost
Hi, I'm a newbie and have a question about text that is stored and then returned from a query. The field in question is of type "text", and is indexed and stored. The original text included various blank lines (line feeds), but when the field is returned in query results, all of the blank lines and extra spaces have been removed. Since I am storing the content for the purpose of displaying it, I need the original formatting to be preserved. Is this possible? I tried changing it to indexed="false" and using a copyField to copy it to the general text field for indexing, but this didn't help. Thanks! Mike
Re: using q= , adding fq=
: > 1) adding something like: q=cat_id:xxx&fq=geo_id:yyy would boost : > performance? : : : For the n > 1 query, yes, adding filters should improve performance : assuming it is selective enough. The tradeoff is memory. You might even find that something like this is faster... q=*:*&fq=cat_id:xxx&fq=geo_id:yyy ...but it can vary based on circumstances (it depends a lot on how many unique cat_id and geo_id values you have, how big each of those sets is, and how big you make your filterCache) : > 2) we do find problems when we ask for a large page offset! ie: : > q=cat_id:xxx and geo_id:yyy&start=544545 : > (note that we limit docs to 50 max per resultset). : > When start is 500 or more, Qtime is >=5 seconds while the avg qtime is : > <100 ms FWIW: limiting the number of rows per request to 50, but not limiting the start, doesn't make much sense -- the same amount of work is needed to handle start=0&rows=5050 and start=5000&rows=50. There are very few use cases for allowing people to iterate through all the rows that also require sorting. -Hoss
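The memory tradeoff Hoss mentions is governed by the filterCache in solrconfig.xml, which caches the set of matching documents for each fq. A typical entry looks something like this (the sizes here are only illustrative, not a recommendation):

```xml
<!-- Cache for filter queries (fq): each entry holds the set of docs
     matching one filter. "size" is the number of cached filters. -->
<filterCache
    class="solr.FastLRUCache"
    size="512"
    initialSize="512"
    autowarmCount="128"/>
```

The more distinct cat_id/geo_id filters you issue, the larger this cache needs to be for the fq approach to pay off.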
RE: SolrPlugin Guidance
: Our QParser plugin will perform queries against directory documents and : return any file document that has the matching directory id(s). So the : plugin transforms the query to something like : : q:+(directory_id:4 directory:10) +directory_id:(4) ... : Currently the parser plugin is doing the lookup queries via the standard : request handler. The problem with this approach is that the look-up : queries are going to be analyzed twice. This only seems to be a problem ...you lost me there. If you are taking part of the query, and using it to get directory ids, and then using those directory ids to build a new query, why are you ever passing the output from one query parser to another query parser? You take the input string, you let the LuceneQParser parse it and use it to search against "Directory" documents, and then you iterate over the results and get an ID from them. You should be using those IDs directly to build your new query. Honestly: even if you were using those ids to build a query string, and then passing that string to the analyzer, I don't see why stemming would cause any problems for you if the ids are numbers (like in your example) -Hoss
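What Hoss suggests amounts to assembling the second query directly from the IDs found in the first search's results. A minimal sketch of that string-building step (the field name and ids are just the ones from the example above, not the plugin's actual code; the ids pass through no analyzer at all):

```java
import java.util.Arrays;
import java.util.List;

public class DirectoryQueryBuilder {
    // Build a Lucene/Solr query string matching any of the given
    // directory ids, e.g. [4, 10] -> "directory_id:(4 OR 10)".
    static String buildQuery(List<Integer> directoryIds) {
        StringBuilder sb = new StringBuilder("directory_id:(");
        for (int i = 0; i < directoryIds.size(); i++) {
            if (i > 0) sb.append(" OR ");
            sb.append(directoryIds.get(i));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // prints: directory_id:(4 OR 10)
        System.out.println(buildQuery(Arrays.asList(4, 10)));
    }
}
```

Since the ids are plain numbers, the resulting clause is unaffected by stemming even if it does get re-analyzed.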
Re: Using facets to narrow results with multiword field
: : I'm using facet.field=lbrand and do get good results, e.g. Geomax, GeoMax, : GEOMAX all fall into "geomax". But when I'm filtering I get : strange results: : : brand:geomax gives numFound="0" : lbrand:geomax gives numFound="57" (GEOMAX, GeoMag, Geomag) : : How should I redefine brand to let narrowing work correctly? I'm not sure I understand what it is that isn't working for you ... if you are faceting on "lbrand" then you should filter on "lbrand" as well ... your query for "brand:geomax" is probably failing because you don't actually have "geomax" as a value for any doc -- which is what you should expect, since that field didn't use a LowerCaseFilter. correct? -Hoss
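For reference, a schema.xml field type that keeps each value as a single token but lowercases it at both index and query time (so that filtering behaves the same way as the lowercased faceting field) might look like this; the type name is invented for illustration:

```xml
<fieldType name="string_lowercase" class="solr.TextField">
  <analyzer>
    <!-- Keep the whole value as one token, then lowercase it -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="lbrand" type="string_lowercase" indexed="true" stored="true"/>
```

With that in place, lbrand:geomax matches any of GEOMAX/GeoMax/Geomax, and the facet values and filter values agree.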
RE: Can solr web site have multiple versions of online API doc?
Israel, > If you downloaded the 1.3.0 release, you should find a "docs" > folder inside the zip file. > > This contains the javadoc for that particular release. > > You may also re-download a 1.3.0 release to get the docs for Solr 1.3. This doesn't solve my problem. I can't write my javadoc comments referencing a Solr API doc located on my local hard drive. The Solr API doc needs to be located on the Internet. Various versions of the J2SE (JDK) API doc and the Lucene API doc are available online at well-defined URLs. I'd like to have the Solr API docs available in a similar manner. Kuro
Re: Filter exclusion on query facets?
Yes, you can tag filters using the new local params format and then explicitly exclude them when providing the facet fields. see: http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters Cheers, Uri Mat Brown wrote: Hi all, Just wondering if it's possible to do filter exclusion (i.e., multiselect faceting) on query facets in Solr 1.4? Thanks! Mat
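Schematically, the pattern from that wiki page is to tag the fq with a LocalParam and then exclude that tag on the facet parameter (the field names below are just placeholders):

```
q=mainquery&fq=status:public&fq={!tag=dt}doctype:pdf&facet=on&facet.field={!ex=dt}doctype
```

The doctype facet counts are then computed as if the doctype:pdf filter were not applied, which is exactly the multi-select behavior.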
store content only of documents
I store documents in a field "content", defined as follows in schema.xml, with a corresponding entry in solrconfig.xml. I want to store only the body content in this field, but it also stores other metadata of the document, e.g. "Author", "timestamp", "document type" etc. How can I ask Solr to store only the body of the document in this field, and not the other metadata? Thanks, -- View this message in context: http://old.nabble.com/store-content-only-of-documents-tp26803101p26803101.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr client query vs Solr search query
Hello, I have a question regarding building a Solr query. On the Solr server running on a Linux box, the following query returns search results: http://ncbu-cam35-2:17003/apache-solr-1.4.0/profile/select/?q=Bangalore&version=2.2&start=0&rows=10&indent=on However, when I try to access the same Solr server from a webapp on Tomcat, if I print out the query it comes out as: http://ncbu-cam35-2:17003/apache-solr-1.4.0/profile?q=bangalore&qt=/profile&rows=100&wt=javabin&version=1 Note that the second query is missing the "select" clause, among other differences. This one does not return results. My question is: am I building my query wrong in my client? Could somebody show me the way? With Regards Sri -- View this message in context: http://old.nabble.com/Solr-client-query-vs-Solr-search-query-tp26802634p26802634.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: wildcard oddity
Do you get the same behavior if you search for "gang" instead of "gets"? I'm wondering if there's something going on with stemEnglishPossessive. According to the docs you *should* be OK since you set stemEnglishPossessive=0, but this would help point in the right direction. Also, am I correct in assuming that that is the analyzer both for indexing AND searching? Best Erick On Tue, Dec 15, 2009 at 3:30 PM, Joe Calderon wrote: > im trying to do a wild card search > > "q":"item_title:(gets*)" returns no results > "q":"item_title:(gets)" returns results > "q":"item_title:(get*)" returns results > > > seems like * at the end of a token is requiring a character, instead > of being 0 or more its acting like 1 or more > > the text im trying to match is "The Gang Gets Extreme: Home Makeover > Edition" > > the field uses the following analyzers > > positionIncrementGap="100" omitNorms="false"> > > > > > > generateWordParts="1" generateNumberParts="0" catenateAll="1" > splitOnNumerics="0" splitOnCaseChange="0" stemEnglishPossessive="0" /> > > > > > is anybody else having similar problems? > > > best, > --joe >
Re: Log of zero result searches
Chris Hostetter wrote: See Also: http://en.wikipedia.org/wiki/Thread_hijacking You may want to update that link, since that wikipedia page has been deleted for some time. cheers stuart -- Stuart Yeates http://www.nzetc.org/ New Zealand Electronic Text Centre http://researcharchive.vuw.ac.nz/ Institutional Repository
Re: facet.field problem in SolrParams to NamedList
: E.g.: "q=something field:value" becomes "q=something value&fq=field:value" : : To do this, in the createParser method, I apply a regular expression : to the qstr param to obtain the fq part, and then I do the following: : : NamedList paramsList = params.toNamedList(); : paramsList.add(CommonParams.FQ, generatedFilterQuery); : params = SolrParams.toSolrParams(paramsList); : req.setParams(params); ... : SolrParams.toNamedList() was saving the array correctly, but the method : SolrParams.toSolrParams(NamedList) was doing: : "params.getVal(i).toString()". So, it always loses the array. I'm having trouble thinking through exactly where the problem is being introduced here ... ultimately what it comes down to is that the NamedList shouldn't be containing a String[] ... it should be containing multiple string values with the same name ("fq") It would be good to make sure all of these methods play nicely with one another so round-trip conversions work as expected -- so if you could open a bug for this with a simple example test case that would be great, ...but... for your purposes, I would skip the NamedList conversion altogether, and just use AppendedSolrParams... Map<String,String> map = new HashMap<String,String>(); map.put("fq", generatedFilterQuery); map.put("q", generatedQueryString); SolrParams myNewParams = new MapSolrParams(map); req.setParams(new AppendedSolrParams(myNewParams, originalParams)); -Hoss
RE: Request Assistance with DIH
Thanks for the reply, just what I was looking for in an answer. I am running under Tomcat 6 on Solaris 10; the person that replied before you looks like they are running under Jetty. I have configured the JNDI context. I stop and start Tomcat using the Solaris SMF, equivalent to services in Linux. But my cwd is pointing to root; I have solr home specified in Catalina/localhost/solr.xml. Is there anything else that I can do to force the cwd to point to solr/home? Thanks again Robbin -Original Message- From: Ken Lane (kenlane) [mailto:kenl...@cisco.com] Sent: Monday, December 14, 2009 11:04 AM To: solr-user@lucene.apache.org Subject: RE: Request Assistance with DIH Hi Robbin, I just went through this myself (I am a newbie). The key things to look at are: 1. Your data_config.xml. I created a table called 'foo' and an ora_data_config.xml file with a simple example to get it working. Some gotchas: If your Oracle DB is configured with Service_name rather than SID (i.e. you may be running failover, RAC, etc), the url parameter of the jdbc connection can read like this: url="jdbc:oracle:thin:@(DESCRIPTION = (LOAD_BALANCE = on) (FAILOVER = on) (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = .cisco.com)(PORT = 1528))) (CONNECT_DATA = (SERVICE_NAME = ..COM)))" 2. In your solrconfig.xml file, add something like this to reference the above listed file: ora-data-config.xml 3. I have Solr 1.4 running under Tomcat 6. It looks like you are trying the jetty example, but pay mind to getting the "cwd" pointing to your solr home by setting your JNDI path as described in the DataImportHandler wiki. 4. When it blows up, as it did numerous times for me until I got it right, check the logs. As I am running under Tomcat, I was able to check \logs\catalina.2009-12-14.log to view DIH errors both upon restart of Tomcat and after running the DIH. 5. There are some tools to check your JDBC connection you might try before pulling too much of your hair out. 
Try here: http://otn.oracle.com/sample_code/tech/java/sqlj_jdbc/content.html Good Luck! Ken -Original Message- From: Turner, Robbin J [mailto:robbin.j.tur...@boeing.com] Sent: Monday, December 14, 2009 10:27 AM To: solr-user@lucene.apache.org Subject: RE: Request Assistance with DIH How does this help answer my question? I am trying to use the DataImportHandler development console. The url you suggest assumes I had it working already. Looking at my logs and the response in the development console, it does not appear that the connection to Oracle is being made. So if someone could offer some configuration/connection setup directions I would very much appreciate it. Thanks Robbin -Original Message- From: Joel Nylund [mailto:jnyl...@yahoo.com] Sent: Friday, December 11, 2009 8:26 PM To: solr-user@lucene.apache.org Subject: Re: Request Assistance with DIH add ?command=full-import to your url http://localhost:8983/solr/dataimport?command=full-import thanks Joel On Dec 11, 2009, at 7:45 PM, Robbin wrote: > I've been trying to use the DIH with Oracle and would love it if > someone could give me some pointers. I put the ojdbc14.jar in both > the Tomcat lib and /lib. I created a dataimport.xml and > enabled it in the solrconfig.xml. I go to the http:/// > solr/admin/dataimport.jsp. This all seems to be fine, but I get the > default page response and it doesn't look like the connection to the > Oracle server is even attempted. > > I'm using the Solr 1.4 release from Nov 10. > Do I need an Oracle client on the server? I thought having the ojdbc > jar should be sufficient. Any help or configuration examples for > setting this up would be much appreciated. > > Thanks > Robbin
Re: Log of zero result searches
: Subject: Log of zero result searches : References: <26747482.p...@talk.nabble.com> <26748588.p...@talk.nabble.com> : <359a9283091203m73b4dc9ya51aa97e460b3...@mail.gmail.com> : <26756663.p...@talk.nabble.com> <26776651.p...@talk.nabble.com> : <359a92830912141657r79881e4bg3a4370d81ea7e...@mail.gmail.com> : In-Reply-To: <359a92830912141657r79881e4bg3a4370d81ea7e...@mail.gmail.com> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is "hidden" in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult. See Also: http://en.wikipedia.org/wiki/Thread_hijacking -Hoss
Re: Can solr web site have multiple versions of online API doc?
2009/12/15 Teruhiko Kurosaka > Lucene keeps multiple versions of its API doc online at > http://lucene.apache.org/java/X_Y_Z/api/all/index.html > for version X.Y.Z. I am finding this very useful when > comparing different versions. This is also good because > the javadoc comments that I write for my software can > reference the API comments of the exact version of > Lucene that I am using. > > At the Solr site, I can only find the API doc of the trunk > build. I cannot find the 1.3.0 API doc, for example. > > Can the Solr site also maintain the API docs for the past > stable versions ? > > -kuro Hi Teruhiko If you downloaded the 1.3.0 release, you should find a "docs" folder inside the zip file. This contains the javadoc for that particular release. You may also re-download a 1.3.0 release to get the docs for Solr 1.3. I hope this helps. -- "Good Enough" is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once. http://www.israelekpo.com/
Re: Reverse sort facet query
: Does anyone know of a good way to perform a reverse-sorted facet query (i.e. rarest first)? I'm fairly confident that code doesn't exist at the moment. If I remember correctly, it would be fairly simple to implement if you'd like to submit a patch: when sorting by count a simple bounded priority queue is used, so we'd just have to change the comparator. If you're interested in working on a patch it should be in SimpleFacets.java. I think the queue is called "BoundedTreeSet" (that's a pretty novel request actually ... I don't remember anyone else ever asking for anything like this before .. can you describe your use case a bit -- I'm curious as to how/when you would use this data) -Hoss
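The comparator change Hoss describes can be sketched with a plain bounded heap (this is not Solr's actual BoundedTreeSet, just an illustration of the idea): to keep the N rarest counts, the heap's head must be the *largest* count seen so far, so it gets evicted whenever a rarer one shows up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class RarestFirst {
    // Keep the 'limit' entries with the SMALLEST counts.
    static List<Integer> rarest(int[] counts, int limit) {
        // Max-heap: head is the largest of the retained counts.
        PriorityQueue<Integer> heap =
            new PriorityQueue<Integer>(limit, Comparator.reverseOrder());
        for (int c : counts) {
            if (heap.size() < limit) {
                heap.add(c);
            } else if (c < heap.peek()) {
                heap.poll(); // evict the most common retained count
                heap.add(c);
            }
        }
        List<Integer> out = new ArrayList<Integer>(heap);
        out.sort(null); // ascending: rarest first
        return out;
    }

    public static void main(String[] args) {
        // prints: [1, 3, 12]
        System.out.println(rarest(new int[]{57, 3, 12, 99, 1, 40}, 3));
    }
}
```

Sorting by count in the usual (most-common-first) mode is the same algorithm with the comparisons flipped, which is why only the comparator needs to change.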
Re: Converting java date to solr date and querying dates
: i want to store dates into a date field called publish date in solr. : how do we do it using solrj I'm pretty sure that when indexing docs, you can add Date objects directly to the SolrInputDocument as field values -- but I'm not 100% certain (I don't use SolrJ much) : likewise how do we query from solr using java date? do we always have : to convert it into UTC field and then query it? all of the query APIs are based on query strings -- so yes, you need to construct the query string on your client side, and yes that includes formatting in UTC. : How do i query solr for documents published on monday or for documents : published on March etc. if you mean "march of any year" or "any monday ever" then there isn't any built-in support for anything like that ... your best bet would either be to add "month_of_year" and "day_of_week" fields and populate them in your client code, or write an UpdateProcessor to run in solr (that could be pretty generic if you want to contribute it back; other people could find it useful) If you mean "published in the most recent march" or "published on the most recent monday" where you don't have to change anything to have the query "do what i mean" as time moves on, then you'd either need to do that when building up your query, or write it as a QParser plugin. : or in that case even apply range queries on it?? basic range queries are easy... http://wiki.apache.org/solr/SolrQuerySyntax http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html -Hoss
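The client-side UTC formatting Hoss mentions can be done with SimpleDateFormat (a sketch; the field name in the range-query comment is just the one from the question):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SolrDateFormat {
    // Solr's DateField expects ISO 8601 in UTC, e.g. 2009-12-15T21:30:00Z
    static String toSolrDate(Date d) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(d);
    }

    public static void main(String[] args) {
        // Epoch millisecond 0 is 1970-01-01T00:00:00Z
        System.out.println(toSolrDate(new Date(0L)));
        // A range query could then be built as, e.g.:
        //   publish_date:[2009-01-01T00:00:00Z TO 2009-12-31T23:59:59Z]
    }
}
```

Note that the time zone must be set explicitly: formatting in the JVM's default zone produces timestamps Solr will accept but silently misinterpret.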
Filter exclusion on query facets?
Hi all, Just wondering if it's possible to do filter exclusion (i.e., multiselect faceting) on query facets in Solr 1.4? Thanks! Mat
Re: Spellchecking - Is there a way to do this?
: My first problem appears because I need suggestions even when the : expression has returned results. It seems that suggestions only appear : when there are no results. Is there a way to do so? can you give us an example of what your queries look like? with the example configs, I can get matches, as well as suggestions... http://localhost:8983/solr/spell?q=ide&spellcheck=true : The second question is: For the purposes that I've mentioned, is the : best way to use spellchecker or mlt component? Or some other (as a : fuzzy query)? there's no clear-cut answer to that -- I don't remember anyone else ever asking about anything particularly similar to what you're doing, so I don't know that there is any precedent for a "best" way to go about it. -Hoss
Re: Concurrent Merge Scheduler & MaxThread Count
: I'm having trouble getting Solr to use more than one thread during index : optimizations. I have the following in my solrconfig.xml: : : 6 : How many segments do you have? I'm not an expert on segment merging, but I'm pretty sure the number of threads it will use is limited based on the number of segments -- so even though you say "use up to 8" it only uses one if that's all that it can use. -Hoss
wildcard oddity
I'm trying to do a wild card search "q":"item_title:(gets*)" returns no results "q":"item_title:(gets)" returns results "q":"item_title:(get*)" returns results It seems like * at the end of a token is requiring a character; instead of being 0 or more it's acting like 1 or more. The text I'm trying to match is "The Gang Gets Extreme: Home Makeover Edition" the field uses the following analyzers is anybody else having similar problems? best, --joe
synonyms
Hi It appears that Solr reads a synonym list at startup from a text file. Is it possible to alter this behaviour so that Solr obtains the synonym list from a database instead? Thanks, Peter
Re: Using lucenes custom filters in solr
> Hi All, > > I have a custom filter for lucene ,Can > anyone help me how to use this > in SOLR. http://wiki.apache.org/solr/SolrPlugins#Tokenizer_and_TokenFilter http://wiki.apache.org/solr/SolrPlugins#Analyzer
Can solr web site have multiple versions of online API doc?
Lucene keeps multiple versions of its API doc online at http://lucene.apache.org/java/X_Y_Z/api/all/index.html for version X.Y.Z. I am finding this very useful when comparing different versions. This is also good because the javadoc comments that I write for my software can reference the API comments of the exact version of Lucene that I am using. At Solr site, I can only find the API doc of the trunk build. I cannot find 1.3.0 API doc, for example. Can Solr site also maintain the API docs for the past stable versions ? -kuro
Re: Using lucenes custom filters in solr
Hi All, I have a custom filter for Lucene. Can anyone help me with how to use this in Solr? Thanks in advance, Pavan
facet.field problem in SolrParams to NamedList
Hi! I wrote a subclass of DisMaxQParserPlugin to add a little filter for processing the "q" param and generating an "fq" param. E.g.: "q=something field:value" becomes "q=something value&fq=field:value" To do this, in the createParser method, I apply a regular expression to the qstr param to obtain the fq part, and then I do the following: NamedList paramsList = params.toNamedList(); paramsList.add(CommonParams.FQ, generatedFilterQuery); params = SolrParams.toSolrParams(paramsList); req.setParams(params); The problem is when I include two "facet.field" in the request. In the results (facets section) it prints "[Ljava.lang.String;@c77a748", which is the result of a toString() over a String[]. So, digging a little into the code, I saw the method SolrParams.toNamedList() was saving the array correctly, but the method SolrParams.toSolrParams(NamedList) was doing "params.getVal(i).toString()". So, it always loses the array. Something similar occurs with the methods SolrParams.toMap() and SolrParams.toMultiMap(). Is this a bug? thanks. Nestor
Re: Exception from Spellchecker
Hi Rafael, Rafael Pappert wrote: I try to enable the spellchecker in my 1.4.0 solr (running with tomcat 6 on debian). But I always get the following exception, when I try to open http://localhost:8080/spell?: The spellcheck=true parameter is missing from your request. Try http://localhost:8080/spell?q=&spellcheck=true -Sascha
Exception from Spellchecker
Hello List, I try to enable the spellchecker in my 1.4.0 solr (running with tomcat 6 on debian). But I always get the following exception, when I try to open http://localhost:8080/spell?: HTTP Status 500 - null java.lang.NullPointerException at java.io.StringReader.(StringReader.java:33) at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:197) at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:78) at org.apache.solr.search.QParser.getQuery(QParser.java:131) at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:89) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454) at java.lang.Thread.run(Thread.java:619) My Configuration 
looks like this: solrconfig.xml textSpell a_spell a_spell true false ./spellchecker false false 1 spellcheck schema.xml .. I don't know what's wrong with the given configuration, and the exception is not really clear ;) Can somebody give me a hint? Thank you in anticipation. Best regards, Rafael.
Re: Document model suggestion
Erick, I know what you mean. Wonder if it is actually cleaner to keep the authorization model out of solr index and filter the data at client side based on the user access rights. Thanks all for help. Erick Erickson wrote: > > Yes, that should work. One hard part is what happens if your > authorization model has groups, especially when membership > in those groups changes. Then you have to go in and update > all the affected docs. > > FWIW > Erick > > On Tue, Dec 15, 2009 at 12:24 PM, caman > wrote: > >> >> Shalin, >> >> Thanks. much appreciated. >> Question about: >> "That is usually what people do. The hard part is when some documents >> are >> shared across multiple users. " >> >> What do you recommend when documents has to be shared across multiple >> users? >> Can't I just multivalue a field with all the users who has access to the >> document? >> >> >> thanks >> >> Shalin Shekhar Mangar wrote: >> > >> > On Tue, Dec 15, 2009 at 7:26 AM, caman >> > wrote: >> > >> >> >> >> Appreciate any guidance here please. Have a master-child table between >> >> two >> >> tables 'TA' and 'TB' where form is the master table. Any row in TA can >> >> have >> >> multiple row in TB. >> >> e.g. row in TA >> >> >> >> id---name >> >> 1---tweets >> >> >> >> TB: >> >> id|ta_id|field0|field1|field2.|field20|created_by >> >> 1|1|value1|value2|value2.|value20|User1 >> >> >> >> >> > >> >> >> >> This works fine and index the data.But all the data for a row in TA >> gets >> >> combined in one document(not desirable). >> >> I am not clear on how to >> >> >> >> 1) separate a particular row from the search results. >> >> e.g. If I search for 'Android' and there are 5 rows for android in TB >> for >> >> a >> >> particular instance in TA, would like to show them separately to user >> and >> >> if >> >> the user click on any of the row,point them to an attached URL in the >> >> application. Should a separate index be maintained for each row in >> TB?TB >> >> can >> >> have millions of rows. 
>> >> >> > >> > The easy answer is that whatever you want to show as results should be >> the >> > thing that you index as documents. So if you want to show tweets as >> > results, >> > one document should represent one tweet. >> > >> > Solr is different from relational databases and you should not think >> about >> > both the same way. De-normalization is the way to go in Solr. >> > >> > >> >> 2) How to protect one user's data from another user. I guess I can >> keep >> a >> >> column for a user_id in the schema and append that filter >> automatically >> >> when >> >> I search through SOLR. Any better alternatives? >> >> >> >> >> > That is usually what people do. The hard part is when some documents >> are >> > shared across multiple users. >> > >> > >> >> Bear with me if these are newbie questions please, this is my first >> day >> >> with >> >> SOLR. >> >> >> >> >> > No problem. Welcome to Solr! >> > >> > -- >> > Regards, >> > Shalin Shekhar Mangar. >> > >> > >> >> -- >> View this message in context: >> http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >> > > -- View this message in context: http://old.nabble.com/Document-model-suggestion-tp26784346p26799016.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Document model suggestion
Yes, that should work. One hard part is what happens if your authorization model has groups, especially when membership in those groups changes. Then you have to go in and update all the affected docs. FWIW Erick On Tue, Dec 15, 2009 at 12:24 PM, caman wrote: > > Shalin, > > Thanks. much appreciated. > Question about: > "That is usually what people do. The hard part is when some documents are > shared across multiple users. " > > What do you recommend when documents has to be shared across multiple > users? > Can't I just multivalue a field with all the users who has access to the > document? > > > thanks > > Shalin Shekhar Mangar wrote: > > > > On Tue, Dec 15, 2009 at 7:26 AM, caman > > wrote: > > > >> > >> Appreciate any guidance here please. Have a master-child table between > >> two > >> tables 'TA' and 'TB' where form is the master table. Any row in TA can > >> have > >> multiple row in TB. > >> e.g. row in TA > >> > >> id---name > >> 1---tweets > >> > >> TB: > >> id|ta_id|field0|field1|field2.|field20|created_by > >> 1|1|value1|value2|value2.|value20|User1 > >> > >> > > > >> > >> This works fine and index the data.But all the data for a row in TA gets > >> combined in one document(not desirable). > >> I am not clear on how to > >> > >> 1) separate a particular row from the search results. > >> e.g. If I search for 'Android' and there are 5 rows for android in TB > for > >> a > >> particular instance in TA, would like to show them separately to user > and > >> if > >> the user click on any of the row,point them to an attached URL in the > >> application. Should a separate index be maintained for each row in TB?TB > >> can > >> have millions of rows. > >> > > > > The easy answer is that whatever you want to show as results should be > the > > thing that you index as documents. So if you want to show tweets as > > results, > > one document should represent one tweet. 
> > > > Solr is different from relational databases and you should not think > about > > both the same way. De-normalization is the way to go in Solr. > > > > > >> 2) How to protect one user's data from another user. I guess I can keep > a > >> column for a user_id in the schema and append that filter automatically > >> when > >> I search through SOLR. Any better alternatives? > >> > >> > > That is usually what people do. The hard part is when some documents are > > shared across multiple users. > > > > > >> Bear with me if these are newbie questions please, this is my first day > >> with > >> SOLR. > >> > >> > > No problem. Welcome to Solr! > > > > -- > > Regards, > > Shalin Shekhar Mangar. > > > > > > -- > View this message in context: > http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html > Sent from the Solr - User mailing list archive at Nabble.com. > >
Re: Document model suggestion
Shalin, Thanks. much appreciated. Question about: "That is usually what people do. The hard part is when some documents are shared across multiple users. " What do you recommend when documents has to be shared across multiple users? Can't I just multivalue a field with all the users who has access to the document? thanks Shalin Shekhar Mangar wrote: > > On Tue, Dec 15, 2009 at 7:26 AM, caman > wrote: > >> >> Appreciate any guidance here please. Have a master-child table between >> two >> tables 'TA' and 'TB' where form is the master table. Any row in TA can >> have >> multiple row in TB. >> e.g. row in TA >> >> id---name >> 1---tweets >> >> TB: >> id|ta_id|field0|field1|field2.|field20|created_by >> 1|1|value1|value2|value2.|value20|User1 >> >> > >> >> This works fine and index the data.But all the data for a row in TA gets >> combined in one document(not desirable). >> I am not clear on how to >> >> 1) separate a particular row from the search results. >> e.g. If I search for 'Android' and there are 5 rows for android in TB for >> a >> particular instance in TA, would like to show them separately to user and >> if >> the user click on any of the row,point them to an attached URL in the >> application. Should a separate index be maintained for each row in TB?TB >> can >> have millions of rows. >> > > The easy answer is that whatever you want to show as results should be the > thing that you index as documents. So if you want to show tweets as > results, > one document should represent one tweet. > > Solr is different from relational databases and you should not think about > both the same way. De-normalization is the way to go in Solr. > > >> 2) How to protect one user's data from another user. I guess I can keep a >> column for a user_id in the schema and append that filter automatically >> when >> I search through SOLR. Any better alternatives? >> >> > That is usually what people do. The hard part is when some documents are > shared across multiple users. 
> > >> Bear with me if these are newbie questions please, this is my first day >> with >> SOLR. >> >> > No problem. Welcome to Solr! > > -- > Regards, > Shalin Shekhar Mangar. > > -- View this message in context: http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html Sent from the Solr - User mailing list archive at Nabble.com.
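A minimal sketch of the per-user filtering idea discussed above, assuming a hypothetical multivalued field named `allowed_users` that lists every user id allowed to see a document. Only the standard library is used to build the request URL; no Solr server is contacted here:

```python
from urllib.parse import urlencode

def build_acl_query(user_query: str, user_id: str) -> str:
    """Append a per-user filter query (fq) to a Solr search request.

    `allowed_users` is a hypothetical multivalued field holding the ids
    of every user who may see a document.
    """
    params = [
        ("q", user_query),
        # fq filters are cached independently of q, so the per-user
        # restriction does not pollute the query result cache
        ("fq", "allowed_users:%s" % user_id),
        ("wt", "json"),
    ]
    return "select?" + urlencode(params)

url = build_acl_query("android", "user42")
print(url)  # select?q=android&fq=allowed_users%3Auser42&wt=json
```

Because the ACL restriction travels in `fq` rather than `q`, the filter set for each user can be reused across that user's queries by the filterCache.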
Re: Payloads with Phrase queries
Lucene 2.9.1 comes with a PayloadTermQuery: http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/search/payloads/PayloadTermQuery.html I have been using that to use the payload as part of the score without any problem. Bill On Tue, Dec 15, 2009 at 6:31 AM, Raghuveer Kancherla < raghuveer.kanche...@aplopio.com> wrote: > The interesting thing I am noticing is that the scoring works fine for a > phrase query like "solr rocks". > This lead me to look at what query I am using in case of a single term. > Turns out that I am using PayloadTermQuery taking a cue from solr-1485 > patch. > > I changed this to BoostingTermQuery (i read somewhere that this is > deprecated .. but i was just experimenting) and the scoring seems to work > as > expected now for a single term. > > Now, the important question is what is the Payload version of a TermQuery? > > Regards > Raghu > > > On Tue, Dec 15, 2009 at 12:45 PM, Raghuveer Kancherla < > raghuveer.kanche...@aplopio.com> wrote: > > > Hi, > > Thanks everyone for the responses, I am now able to get both phrase > queries > > and term queries to use payloads. > > > > However the the score value for each document (and consequently, the > > ordering of documents) are coming out wrong. > > > > In the solr output appended below, document 4 has a score higher than the > > document 2 (look at the debug part). The results section shows a wrong > score > > (which is the payload value I am returning from my custom similarity > class) > > and the ordering is also wrong because of this. Can someone explain this > ? > > > > My custom query parser is pasted here http://pastebin.com/m9f21565 > > > > In the similarity class, I return 10.0 if payload is 1 and 20.0 if > payload > > is 2. For everything else I return 1.0. 
> > > > { > > 'responseHeader':{ > > 'status':0, > > 'QTime':2, > > 'params':{ > > 'fl':'*,score', > > 'debugQuery':'on', > > 'indent':'on', > > > > > > 'start':'0', > > 'q':'solr', > > 'qt':'aplopio', > > 'wt':'python', > > 'fq':'', > > 'rows':'10'}}, > > 'response':{'numFound':5,'start':0,'maxScore':20.0,'docs':[ > > > > > > { > >'payloadTest':'solr|2 rocks|1', > >'id':'2', > >'score':20.0}, > > { > >'payloadTest':'solr|2', > >'id':'4', > >'score':20.0}, > > > > > > { > >'payloadTest':'solr|1 rocks|2', > >'id':'1', > >'score':10.0}, > > { > >'payloadTest':'solr|1 rocks|1', > >'id':'3', > >'score':10.0}, > > > > > > { > >'payloadTest':'solr', > >'id':'5', > >'score':1.0}] > > }, > > 'debug':{ > > 'rawquerystring':'solr', > > 'querystring':'solr', > > > > > > 'parsedquery':'PayloadTermQuery(payloadTest:solr)', > > 'parsedquery_toString':'payloadTest:solr', > > 'explain':{ > > '2':'\n7.227325 = (MATCH) fieldWeight(payloadTest:solr in 1), > product of:\n 14.142136 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n20.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 0.625 = fieldNorm(field=payloadTest, doc=1)\n', > > > > > > '4':'\n11.56372 = (MATCH) fieldWeight(payloadTest:solr in 3), > product of:\n 14.142136 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n20.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 1.0 = fieldNorm(field=payloadTest, doc=3)\n', > > > > > > '1':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 0), > product of:\n 7.071068 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n10.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 0.625 = fieldNorm(field=payloadTest, doc=0)\n', > > > > > > '3':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 2), > product of:\n 7.071068 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n10.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 0.625 = fieldNorm(field=payloadTest, doc=2)\n', > > 
> > > > '5':'\n0.578186 = (MATCH) fieldWeight(payloadTest:solr in 4), > product of:\n 0.70710677 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n1.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 1.0 = fieldNorm(field=payloadTest, doc=4)\n'}, > > > > > > 'QParser':'BoostingTermQParser', > > 'filter_queries':[''], > > 'parsed_filter_queries':[], > > 'timing':{ > > 'time':2.0, > > 'prepare':{ > >'time':1.0, > > > > > >'org.apache.solr.handler.component.QueryComponent':{ > > 'time':1.0}, > >'org.apache.solr.handler.component.FacetComponent':{ > > 'time':0.0}, > >'org.apache.solr.handler.component.MoreLikeThisComponent':{ > > > > > > 'time':0.0}, > >'org.apache.solr.handler.component.HighlightComponent':{ > > 'time':0.0}, > >'org.apache.solr.handler.component.StatsComponen
Re: solr php client vs file_get_contents?
In the end, the PHP client does a file_get_contents for doing a search the same way you'd do it "manually". It's all PHP, so you can do anything it does yourself. It provides what any library of PHP classes should - convenience. I use the JSON response writer because it gets the most attention from the Solr community of all the non-XML writers, yet is still very quick to parse (you might want to do your own tests comparing the speed of unserializing a Solr phps response versus json_decode'ing the json version). Happy Solr'ing, - Donovan On Dec 15, 2009, at 8:49 AM, Faire Mii wrote: i am using php to access solr and i wonder one thing. why should i use solr php client when i can use $serializedResult = file_get_contents('http://localhost:8983/solr/select?q=niklas&wt=phps'); to get the result in arrays and then print them out? i dont really get the difference. is there any richer features with the php client? regards fayer
Re: solr php client vs file_get_contents?
On Tue, Dec 15, 2009 at 8:49 AM, Faire Mii wrote: > i am using php to access solr and i wonder one thing. > > why should i use solr php client when i can use > > $serializedResult = file_get_contents('http://localhost:8983/solr/select?q=niklas&wt=phps'); > > to get the result in arrays and then print them out? > > i dont really get the difference. is there any richer features with the php > client? > > > regards > > fayer Hi Faire, Have you actually used this library before? I think the library is pretty well thought out. From a simple glance at the source code you can see that one can use it for the following purposes: 1. Adding documents to the index (which you cannot just do with file_get_contents alone). So that's one difference. 2. Updating existing documents 3. Deleting existing documents. 4. Balancing requests across multiple backend servers There are other operations with the Solr server that the library can also perform. Some examples of what I am referring to are illustrated here http://code.google.com/p/solr-php-client/wiki/FAQ http://code.google.com/p/solr-php-client/wiki/ExampleUsage IBM also has an interesting article illustrating how to add documents to the Solr index and issue commit and optimize calls using this library. http://www.ibm.com/developerworks/opensource/library/os-php-apachesolr/ The author of the library can probably give you more details on what the library has to offer. I think you should download the source code and spend some time looking at all the features it has to offer. In my opinion, it is not fair to compare a well thought out library like that with a simple php function. -- "Good Enough" is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once. http://www.israelekpo.com/
solr php client vs file_get_contents?
i am using php to access solr and i wonder one thing. why should i use solr php client when i can use $serializedResult = file_get_contents('http://localhost:8983/solr/select?q=niklas&wt=phps'); to get the result in arrays and then print them out? i don't really get the difference. are there any richer features with the php client? regards fayer
Re: search in all fields for multiple values?
On Tue, Dec 15, 2009 at 5:35 PM, Faire Mii wrote: > i have two fields: > > title > body > > and i want to search for two words > > dog > OR > cat > > in each of them. > > i have tried q=*:dog OR cat > > but it doesnt work. > > how should i type it? > > PS. could i enter default search field = ALL fields in schema.xml in > someway? > See http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_.22superman.22_in_both_the_title_and_subject_fields You can also create a copyField to which you can copy both title and body and specify that as the default search field. -- Regards, Shalin Shekhar Mangar.
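For illustration, one way to build the field-qualified query the linked FAQ describes, using the field names from the question. This is a sketch; the escaping shown is just Python's `urlencode` behavior, nothing Solr-specific:

```python
from urllib.parse import urlencode

def multi_field_query(fields, terms):
    """Build a Lucene-syntax query that matches any of `terms`
    in any of `fields`, e.g. title:(dog OR cat) OR body:(dog OR cat)."""
    joined = " OR ".join(terms)
    clause = " OR ".join("%s:(%s)" % (f, joined) for f in fields)
    return urlencode([("q", clause)])

print(multi_field_query(["title", "body"], ["dog", "cat"]))
# q=title%3A%28dog+OR+cat%29+OR+body%3A%28dog+OR+cat%29
```

The copyField alternative mentioned in the reply avoids this query expansion entirely, at the cost of a larger index.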
search in all fields for multiple values?
i have two fields: title body and i want to search for two words dog OR cat in each of them. i have tried q=*:dog OR cat but it doesn't work. how should i type it? PS. could i enter default search field = ALL fields in schema.xml in some way?
Re: Payloads with Phrase queries
The interesting thing I am noticing is that the scoring works fine for a phrase query like "solr rocks". This led me to look at what query I am using in the case of a single term. Turns out that I am using PayloadTermQuery taking a cue from the solr-1485 patch. I changed this to BoostingTermQuery (i read somewhere that this is deprecated .. but i was just experimenting) and the scoring seems to work as expected now for a single term. Now, the important question is: what is the Payload version of a TermQuery? Regards Raghu On Tue, Dec 15, 2009 at 12:45 PM, Raghuveer Kancherla < raghuveer.kanche...@aplopio.com> wrote: > Hi, > Thanks everyone for the responses, I am now able to get both phrase queries > and term queries to use payloads. > > However the score value for each document (and consequently, the > ordering of documents) is coming out wrong. > > In the solr output appended below, document 4 has a score higher than > document 2 (look at the debug part). The results section shows a wrong score > (which is the payload value I am returning from my custom similarity class) > and the ordering is also wrong because of this. Can someone explain this? > > My custom query parser is pasted here http://pastebin.com/m9f21565 > > In the similarity class, I return 10.0 if the payload is 1 and 20.0 if the payload > is 2. For everything else I return 1.0. 
> > { > 'responseHeader':{ > 'status':0, > 'QTime':2, > 'params':{ > 'fl':'*,score', > 'debugQuery':'on', > 'indent':'on', > > > 'start':'0', > 'q':'solr', > 'qt':'aplopio', > 'wt':'python', > 'fq':'', > 'rows':'10'}}, > 'response':{'numFound':5,'start':0,'maxScore':20.0,'docs':[ > > > { >'payloadTest':'solr|2 rocks|1', >'id':'2', >'score':20.0}, > { >'payloadTest':'solr|2', >'id':'4', >'score':20.0}, > > > { >'payloadTest':'solr|1 rocks|2', >'id':'1', >'score':10.0}, > { >'payloadTest':'solr|1 rocks|1', >'id':'3', >'score':10.0}, > > > { >'payloadTest':'solr', >'id':'5', >'score':1.0}] > }, > 'debug':{ > 'rawquerystring':'solr', > 'querystring':'solr', > > > 'parsedquery':'PayloadTermQuery(payloadTest:solr)', > 'parsedquery_toString':'payloadTest:solr', > 'explain':{ > '2':'\n7.227325 = (MATCH) fieldWeight(payloadTest:solr in 1), product > of:\n 14.142136 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n20.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 0.625 = fieldNorm(field=payloadTest, doc=1)\n', > > > '4':'\n11.56372 = (MATCH) fieldWeight(payloadTest:solr in 3), product > of:\n 14.142136 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n20.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 1.0 = fieldNorm(field=payloadTest, doc=3)\n', > > > '1':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 0), product > of:\n 7.071068 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n10.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 0.625 = fieldNorm(field=payloadTest, doc=0)\n', > > > '3':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 2), product > of:\n 7.071068 = (MATCH) btq, product of:\n0.70710677 = > tf(phraseFreq=0.5)\n10.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 0.625 = fieldNorm(field=payloadTest, doc=2)\n', > > > '5':'\n0.578186 = (MATCH) fieldWeight(payloadTest:solr in 4), product > of:\n 0.70710677 = (MATCH) btq, product 
of:\n0.70710677 = > tf(phraseFreq=0.5)\n1.0 = scorePayload(...)\n 0.81767845 = > idf(payloadTest: solr=5)\n 1.0 = fieldNorm(field=payloadTest, doc=4)\n'}, > > > 'QParser':'BoostingTermQParser', > 'filter_queries':[''], > 'parsed_filter_queries':[], > 'timing':{ > 'time':2.0, > 'prepare':{ >'time':1.0, > > >'org.apache.solr.handler.component.QueryComponent':{ > 'time':1.0}, >'org.apache.solr.handler.component.FacetComponent':{ > 'time':0.0}, >'org.apache.solr.handler.component.MoreLikeThisComponent':{ > > > 'time':0.0}, >'org.apache.solr.handler.component.HighlightComponent':{ > 'time':0.0}, >'org.apache.solr.handler.component.StatsComponent':{ > 'time':0.0}, >'org.apache.solr.handler.component.DebugComponent':{ > > > 'time':0.0}}, > 'process':{ >'time':1.0, >'org.apache.solr.handler.component.QueryComponent':{ > 'time':0.0}, >'org.apache.solr.handler.component.FacetComponent':{ > > > 'time':0.0}, >'org.apache.solr.handler.component.MoreLikeThisComponent':{ > 'time':0.0}, >'org.apache.solr.handler.component.HighlightComponent':{ > 'time':0.0}, > > >'org.apache.solr.handler.component.StatsCom
Re: question regarding dynamic fields
On Mon, Dec 14, 2009 at 1:00 PM, Phanindra Reva wrote: > Hello.., > I have observed that the text or keywords which are being > indexed using the dynamicField concept are searchable only when we > mention the field name too while querying. Am I wrong with my observation > or is it the default and can not be changed? I am just wondering if > there is any route to search the text indexed using dynamicFields without > having to mention the field name in the query. > Thanks. > If you are asking if you can give *_s to search on all dynamic fields ending with "_s", then the answer is no. You must specify the field name. -- Regards, Shalin Shekhar Mangar.
Re: Log of zero result searches
On Tue, Dec 15, 2009 at 2:36 PM, Roland Villemoes wrote: > Yes, correct. > > But to use that - the search client must collect this information whenever > we have "0" results. > I do not want that to be part of the client application (quite hard when > that is SolrJS) - this should be collected server side - on Solr. > Do you know how to do that? > > The number of hits is logged along with each query at INFO level. You can analyze the logs to figure out this stat. -- Regards, Shalin Shekhar Mangar.
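A rough sketch of the server-side log analysis suggested here. The log lines below are simplified stand-ins for Solr's INFO request logs (the exact format varies by version and servlet container), but the idea of extracting requests with hits=0 is the same:

```python
import re

# Simplified example lines; real Solr INFO request logs include a
# "hits=N" token per query, though surrounding fields may differ.
LOG = """\
INFO: [] webapp=/solr path=/select params={q=ipod} hits=12 status=0 QTime=3
INFO: [] webapp=/solr path=/select params={q=andriod} hits=0 status=0 QTime=1
INFO: [] webapp=/solr path=/select params={q=cat} hits=7 status=0 QTime=2
INFO: [] webapp=/solr path=/select params={q=zune+2010} hits=0 status=0 QTime=1
"""

def zero_hit_queries(log_text):
    """Return the q parameter of every request that matched no documents."""
    zero = []
    for line in log_text.splitlines():
        m = re.search(r"params=\{q=([^}&]*)[^}]*\} hits=(\d+)", line)
        if m and m.group(2) == "0":
            zero.append(m.group(1))
    return zero

print(zero_hit_queries(LOG))  # ['andriod', 'zune+2010']
```

Run periodically over the rotated logs, this gives the zero-result report without touching the client application.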
Re: I cant get it to work
I've only just started with Solr too. As a newbie, first I'd say forget about trying to "compare" it to your mysql database. It's completely different and performs its own job in its own way. You feed a document in, and you store that information in the most efficient manner you can to perform the search and return the results you want. So ask, what do I want to search against? field1 field2 field3 That's what you "feed" into Solr. Then ask, what information do I want to "return" after a search? This determines how you "store" the information you've just "fed" into Solr. Say you want to return: field2 Then you might accept field1, field2, and field3 and merge them together into 1 searchable field called "searchtext". This is what users will search against. Then you'd also have "field2" as another field. field2 (not indexed, stored) searchtext (combination of field1,field2,field3 - indexed, not stored) So then you could search against "searchtext" and return "field2" as the result. Hope that provides some explanation (I know it's basic). From my very limited experience with it, Solr is great. My biggest hurdle was getting my head around the fact that it's NOT a relational database (ie. mysql) but a separate tool that you configure in the best way for your "search" and only that. -- View this message in context: http://old.nabble.com/I-cant-get-it-to-work-tp26791099p26792373.html Sent from the Solr - User mailing list archive at Nabble.com.
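The single-searchable-field setup described above might look something like this in schema.xml (field and type names are illustrative, not from the original post):

```xml
<!-- stored for display, not searched directly -->
<field name="field2" type="string" indexed="false" stored="true"/>

<!-- catch-all search field: indexed, not stored -->
<field name="searchtext" type="text" indexed="true" stored="false"
       multiValued="true"/>

<copyField source="field1" dest="searchtext"/>
<copyField source="field2" dest="searchtext"/>
<copyField source="field3" dest="searchtext"/>

<defaultSearchField>searchtext</defaultSearchField>
```

With this in place, a bare q=android searches the merged text while the response returns only the stored display field.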
Re: maximum no of values in multi valued string field
On Tue, Dec 15, 2009 at 3:13 PM, bharath venkatesh < bharath.venkat...@ibibogroup.com> wrote: > Hi , > Is there any limit in no of values stored in a single multi valued > string field ? There is no theoretical limit. There are practical limits because your documents are heavier. The document cache stores lucene documents in memory. > if a single multi valued string field contains 1000-2000 string values what > will be effect on query performance (we will be only indexing this field not > storing it ) ? Yes, the more the number of tokens, the longer it may take to search across them. Faceting performance can drop drastically for such a large number of values. > is it better to store all the strings in a single text field instead of > multi valued string field. > > It wouldn't make a lot of difference. The XML response may be a bit shorter. In a single field, highlighting can cause adjacent terms to be highlighted, which you may not want. -- Regards, Shalin Shekhar Mangar.
maximum no of values in multi valued string field
Hi , Is there any limit in no of values stored in a single multi valued string field ? if a single multi valued string field contains 1000-2000 string values what will be effect on query performance (we will be only indexing this field not storing it ) ? is it better to store all the strings in a single text field instead of multi valued string field. Thanks in Advance, Bharath
Re: Auto update with deltaimport
Hi, thanks! I've done it by writing a script that calls http://localhost:8080/solr/dataimport?command=delta-import automatically :-) Joel Nylund wrote: > > windows or unix? > > unix - make a shell script and call it from cron > > windows - make a .bat or .cmd file and call it from scheduler > > within the shell scripts/bat files use wget or curl to call the right > import: > > wget -q -O /dev/null > http://localhost:8983/solr/dataimport?command=delta-import > > > Joel > > On Dec 12, 2009, at 1:38 AM, Olala wrote: > >> >> Hi All! >> >> I am developing a search engine using Solr, I was tested full-import >> and >> delta-import command successfully.But now,I want to run delta-import >> automatically with my schedule.So, can anyone help me??? >> >> Thanks & Regards, >> -- >> View this message in context: >> http://old.nabble.com/Auto-update-with-deltaimport-tp26755386p26755386.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> > > > -- View this message in context: http://old.nabble.com/Auto-update-with-deltaimport-tp26755386p26792041.html Sent from the Solr - User mailing list archive at Nabble.com.
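For completeness, the cron variant from Joel's reply can be a single crontab entry (the schedule, port and path here are examples; adjust to your install):

```
# run a delta-import every 15 minutes, discarding the response body
*/15 * * * * wget -q -O /dev/null "http://localhost:8983/solr/dataimport?command=delta-import"
```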
Re: Query on Cache size.
On Mon, Dec 14, 2009 at 7:17 PM, kalidoss < kalidoss.muthuramalin...@sifycorp.com> wrote: > Hi, > > We have enabled the query result cache, its 512 entries, > > we have calculated the size used for cache : > page size about 1000bytes, (1000*512)/1024/1024 = .48MB > > The query result cache is a map of (q, sort, n) to an ordered list of Lucene doc ids. Assuming queryResultWindowSize is 20 and an average user does not go beyond 20 results, your memory usage of the values in this map is approx 20*sizeof(int)*512. Add some more for keys, map, references etc. -- Regards, Shalin Shekhar Mangar.
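Shalin's estimate can be checked with a little arithmetic. This is only a lower bound for the cached values; as he notes, object headers, keys, the map itself and references come on top:

```python
def query_result_cache_bytes(entries, window_size=20, bytes_per_docid=4):
    """Rough lower bound for queryResultCache memory: each entry stores an
    ordered list of up to `window_size` Lucene doc ids (4-byte ints),
    not whole result pages."""
    return entries * window_size * bytes_per_docid

size = query_result_cache_bytes(512)
print(size, "bytes (~%.0f KB)" % (size / 1024.0))  # 40960 bytes (~40 KB)
```

So 512 entries cost on the order of 40 KB for the doc-id lists, far below the 0.48 MB figure computed from a 1000-byte "page size".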
Re: Not able to display search results on Tomcat/Solrj
On Tue, Dec 15, 2009 at 1:07 AM, insaneyogi3008 wrote: > > Hello, > > I am running a simple program http://old.nabble.com/file/p26779970/SolrjTest.java SolrjTest.java to get > search results from a remote Solr server. I seem to correctly get back the > number of documents that match my query, but I am not able to display the > search results themselves. > > My question is, is this a known issue? I have attached the test & below is > the sample of the result : > > What are "displayname" and "displayphone"? Are they even in your schema? Print out the SolrDocument object directly and you should see the results. -- Regards, Shalin Shekhar Mangar.
Re: Document model suggestion
On Tue, Dec 15, 2009 at 7:26 AM, caman wrote: > > Appreciate any guidance here please. Have a master-child table between two > tables 'TA' and 'TB' where form is the master table. Any row in TA can have > multiple rows in TB. > e.g. row in TA > > id---name > 1---tweets > > TB: > id|ta_id|field0|field1|field2.|field20|created_by > 1|1|value1|value2|value2.|value20|User1 > > > > This works fine and indexes the data. But all the data for a row in TA gets > combined in one document (not desirable). > I am not clear on how to > > 1) separate a particular row from the search results. > e.g. If I search for 'Android' and there are 5 rows for android in TB for a > particular instance in TA, would like to show them separately to the user and > if > the user clicks on any of the rows, point them to an attached URL in the > application. Should a separate index be maintained for each row in TB? TB > can > have millions of rows. > The easy answer is that whatever you want to show as results should be the thing that you index as documents. So if you want to show tweets as results, one document should represent one tweet. Solr is different from relational databases and you should not think about both the same way. De-normalization is the way to go in Solr. > 2) How to protect one user's data from another user. I guess I can keep a > column for a user_id in the schema and append that filter automatically > when > I search through SOLR. Any better alternatives? > > That is usually what people do. The hard part is when some documents are shared across multiple users. > Bear with me if these are newbie questions please, this is my first day > with > SOLR. > > No problem. Welcome to Solr! -- Regards, Shalin Shekhar Mangar.
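A sketch of the denormalization advice: one Solr document per TB row, carrying along the parent TA fields. Field names follow the example tables in the thread; `ta_name` and the `tb-` id prefix are invented for illustration:

```python
def denormalize(ta_row, tb_rows):
    """One Solr document per TB row, with the parent TA fields copied in."""
    docs = []
    for tb in tb_rows:
        doc = {
            "id": "tb-%s" % tb["id"],      # unique key per child row
            "ta_name": ta_row["name"],      # denormalized parent field
            "created_by": tb["created_by"],
        }
        # copy the searchable child columns (field0..field20)
        doc.update({k: v for k, v in tb.items() if k.startswith("field")})
        docs.append(doc)
    return docs

ta = {"id": 1, "name": "tweets"}
tb = [{"id": 1, "ta_id": 1, "field0": "value1", "created_by": "User1"},
      {"id": 2, "ta_id": 1, "field0": "Android rocks", "created_by": "User2"}]
print(denormalize(ta, tb))
```

Each TB row then comes back as its own search hit, which answers the "show them separately" requirement directly.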
RE: Log of zero result searches
Yes, correct. But to use that - the search client must collect this information whenever we have "0" results. I do not want that to be part of the client application (quite hard when that is SolrJS) - this should be collected server side - on Solr. Do you know how to do that? Roland -----Original Message----- From: David Stuart [mailto:david.stu...@progressivealliance.co.uk] Sent: 15 December 2009 09:33 To: solr-user@lucene.apache.org Subject: Re: Log of zero result searches The returning XML result tag has a numFound attribute that will report 0 if nothing matches your search criteria David On 15 Dec 2009, at 08:16, Roland Villemoes wrote: > Hi > > Question: How do you log zero result searches? > > It's quite important from a business perspective to know which searches > return zero/empty results. > Does anybody know a way to get this information? > > Roland Villemoes
Re: Log of zero result searches
The returning XML result tag has a numFound attribute that will report 0 if nothing matches your search criteria. David On 15 Dec 2009, at 08:16, Roland Villemoes wrote: Hi Question: How do you log zero result searches? It's quite important from a business perspective to know which searches return zero/empty results. Does anybody know a way to get this information? Roland Villemoes
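If the client does inspect responses, checking numFound from the XML is straightforward. This sketch parses a trimmed example response (structure as in Solr's default XML writer) rather than a live one:

```python
import xml.etree.ElementTree as ET

# Trimmed example of Solr's default XML response for a zero-hit search.
RESPONSE = """<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <result name="response" numFound="0" start="0"/>
</response>"""

def num_found(xml_text):
    """Read the numFound attribute from the main result element."""
    root = ET.fromstring(xml_text)
    result = root.find(".//result[@name='response']")
    return int(result.get("numFound"))

if num_found(RESPONSE) == 0:
    print("zero-result search - worth logging")
```

The server-side log analysis suggested elsewhere in the thread avoids putting this check in every client.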
Re: I cant get it to work
Hi, The answer is "it depends" ;) If your 10 tables represent one entity, e.g. a person and their address etc., then the one-document approach works. But if your 10 tables each represent a series of entities that you want to surface in your search results separately, then make a document for each (i.e. it depends on your data). What is your use case? Are you wanting a search index that is able to search on every field in your 10 tables or just a few? Think of it this way: if you were creating SQL to pull the data out of the db using joins etc., what fields would you grab? Do you get multiple rows back because some of your tables have a one-to-many relationship? Once you have formed that query, that is your document, minus the duplicate information caused by the joined rows. Cheers David On 15 Dec 2009, at 08:05, Faire Mii wrote: I just cant get it. If i got 10 tables in mysql and they are all related to each other with foreign keys. Should i have 10 documents in solr? or just one document with rows from all tables in it? i have tried in vain for 2 days now...plz help regards fayer
Log of zero result searches
Hi Question: How do you log zero result searches? It's quite important from a business perspective to know which searches return zero/empty results. Does anybody know a way to get this information? Roland Villemoes
I cant get it to work
I just can't get it. If i got 10 tables in mysql and they are all related to each other with foreign keys. Should i have 10 documents in solr? or just one document with rows from all tables in it? i have tried in vain for 2 days now...plz help regards fayer