Re: pagerank??

2012-04-04 Thread Bing Li
According to my knowledge, Solr cannot support this. In my case, I get data by keyword-matching from Solr and then rank the data by PageRank after that. Thanks, Bing On Wed, Apr 4, 2012 at 6:37 AM, Manuel Antonio Novoa Proenza mano...@estudiantes.uci.cu wrote: Hello, I have in my Solr
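The two-step approach described above (keyword matching in Solr, then ranking by PageRank outside Solr) might look roughly like the following. This is only a sketch under assumptions: the class name, the method name, and the idea of keeping PageRank scores in an in-memory map keyed by document ID are hypothetical, and documents without a score are treated as 0.0.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper: re-orders IDs returned by a Solr keyword query
// using PageRank scores that were computed elsewhere.
class PageRankReorder {
    // Returns a new list ordered by descending PageRank score.
    // IDs missing from the map get score 0.0 (an assumption for this sketch).
    // The sort is stable, so ties keep the order Solr returned them in.
    static List<String> rerank(List<String> matchedIds, Map<String, Double> pageRank) {
        List<String> sorted = new ArrayList<>(matchedIds);
        sorted.sort(Comparator
                .comparingDouble((String id) -> pageRank.getOrDefault(id, 0.0))
                .reversed());
        return sorted;
    }
}
```

Sorting on the client keeps Solr untouched, at the cost of only re-ranking the hits that were actually fetched.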

Re: pagerank??

2012-04-04 Thread Ravish Bhagdev
You might want to look into Nutch and its LinkRank instead of Solr for this. For obtaining such information, you need a crawler to crawl through the links. Not what Solr is meant for. Rav On Wed, Apr 4, 2012 at 8:46 AM, Bing Li lbl...@gmail.com wrote: According to my knowledge, Solr cannot

Re: UTF-8 encoding

2012-04-04 Thread henri
I have finally solved my problem!! Did the following: added two lines in the /browse requestHandler: <str name="v.properties">velocity.properties</str> <str name="v.contentType">text/html;charset=UTF-8</str> Moved velocity.properties from solr/conf/velocity to solr/conf. Not being an expert, I

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread Ravish Bhagdev
Updating a single field is not possible in Solr. The whole record has to be rewritten. 300 MB is still not that big a file. Have you tried doing the indexing (if it's only a one-time thing) by giving it ~2 GB of -Xmx? A single file of that size is strange! May I ask what it is? Rav On Tue,

query time customized boosting

2012-04-04 Thread monmohan
Hi, My index is composed of documents with an author field. My system is a users portal where they can have a friend relationship among each other. When a user searches for documents, I would like to boost score of docs in which author is friend of the user doing the search. Note that the list of

Choosing tokenizer based on language of document

2012-04-04 Thread Prakashganesh, Prabhu
Hi, I have documents in different languages and I want to choose the tokenizer to use for a document based on the language of the document. The language of the document is already known and is indexed in a field. What I want to do is when I index the text in the document, I want to choose

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread Mikhail Khludnev
There is https://issues.apache.org/jira/browse/LUCENE-3837 but I suppose it's too far from completion. On Wed, Apr 4, 2012 at 2:48 PM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: Updating a single field is not possible in solr. The whole record has to be rewritten. 300 MB is still not

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread Ravish Bhagdev
Yes, I think there are good reasons why it works like that. The focus of a search system is to be efficient on the query side at the cost of being less efficient on storage. You must, however, also note that by default a field's length is limited to 10,000 words in solrconfig.xml, which you may also need to

LBHttpSolrServer to query a preferred server

2012-04-04 Thread Martin Grotzke
Hi, we want to use the LBHttpSolrServer (4.0/trunk) and specify a preferred server. Our use case is that for one user request we make several solr requests with some heavy caching (using a custom request handler with a special cache) and want to make sure that the subsequent solr requests are

Commitwithin

2012-04-04 Thread Jens Ellenberg
Hello, I am trying to use commitWithin in Java but there seems to be no commit at all with this option. 1. Example Code: UpdateRequest request = new UpdateRequest(); request.deleteByQuery("fild:value"); request.setCommitWithin(1);

Re: solrcloud is deleteByQuery stored in transactions and forwarded like other operations?

2012-04-04 Thread Mark Miller
On Apr 3, 2012, at 10:35 PM, Jamie Johnson wrote: I haven't personally seen this issue but I have been told by another developer that he ran a deleteByQuery(*:*). This deleted the index, but on restart there was information still in the index. Should this be possible? I had planned to

Re: Commitwithin

2012-04-04 Thread Mark Miller
Solr version? I think that for a while now, deletes were not triggering commitWithin. I think this was recently fixed - if I remember right it will be part of 3.6 and then 4. - Mark Miller lucidimagination.com On Apr 4, 2012, at 10:12 AM, Jens Ellenberg wrote: Hello, I am trying to use

Search for library returns 0 results, but search for marion library returns many results

2012-04-04 Thread Sean Adams-Hiett
This is cross posted on Drupal.org: http://drupal.org/node/1515046 Summary: I have a fairly clean install of Drupal 7 with Apachesolr-1.0-beta18. I have created a content type called document with a number of fields. I am working with 30k+ records, most of which are related to Marion, IA in some

RE: Search for library returns 0 results, but search for marion library returns many results

2012-04-04 Thread Joshua Sumali
Did you try to append debugQuery=on to get more information? -Original Message- From: Sean Adams-Hiett [mailto:s...@advantage-companies.com] Sent: Wednesday, April 04, 2012 10:43 AM To: solr-user@lucene.apache.org Subject: Search for library returns 0 results, but search for marion
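For readers unsure what appending debugQuery=on means in practice, here is a small sketch. The helper name and the example URL are hypothetical; only the debugQuery=on parameter itself comes from the message above.

```java
// Hypothetical helper: adds debugQuery=on to a Solr select URL
// so the response includes the parsed query and scoring details.
class DebugQueryUrl {
    static String withDebug(String selectUrl) {
        // Leave the URL alone if the parameter is already present.
        if (selectUrl.contains("debugQuery=")) {
            return selectUrl;
        }
        // Use '&' when a query string already exists, '?' otherwise.
        String separator = selectUrl.contains("?") ? "&" : "?";
        return selectUrl + separator + "debugQuery=on";
    }
}
```

For example, an assumed URL such as http://localhost:8983/solr/select?q=library would become http://localhost:8983/solr/select?q=library&debugQuery=on.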

Re: Search for library returns 0 results, but search for marion library returns many results

2012-04-04 Thread Ravish Bhagdev
Yes, can you check if results you get with marion library match on marion or library? By default solr uses OR between words (specified in solrconfig.xml). You can also easily check this by enabling highlighting. Ravish On Wed, Apr 4, 2012 at 4:11 PM, Joshua Sumali jsum...@kobo.com wrote: Did

Re: PageRank

2012-04-04 Thread Manuel Antonio Novoa Proenza
Hi Rav, thank you for your answer. In my case I use Nutch for crawling the web, but I am a true rookie with Nutch. How do I configure Nutch to return that information? And how do I make Solr index that information, or is that information built into the score of the indexed documents? thank

Re: PageRank

2012-04-04 Thread Markus Jelsma
Hi, Please subscribe to the Nutch mailing list. Scoring is straightforward and calculated scores can be written to the CrawlDB or as external file field for Solr. Cheers On Wed, 04 Apr 2012 10:22:46 -0500 (COT), Manuel Antonio Novoa Proenza mano...@estudiantes.uci.cu wrote: hi Rav Thank

Re: Search for library returns 0 results, but search for marion library returns many results

2012-04-04 Thread Sean Adams-Hiett
Here are some of the XML results with the debug on: <response> <result name="response" numFound="0" start="0"/> <lst name="highlighting"/> <lst name="debug"> <str name="rawquerystring">library</str> <str name="querystring">library</str> <str name="parsedquery">+DisjunctionMaxQuery((content:librari)~0.01)

Evaluating Solr

2012-04-04 Thread Joseph Werner
Hi, I'm evaluating Solr for use in a project. In the Solr FAQ under "How can I rebuild my index from scratch if I change my schema?", after restarting the server, step 5 is to "Re-Index your data", but no mention is made of how this is done. For more routine changes, are record updates supported without

Re: Evaluating Solr

2012-04-04 Thread Glen Newton
Re-Index your data ~= Reload your data On Wed, Apr 4, 2012 at 12:46 PM, Joseph Werner telco...@gmail.com wrote: Hi, I'm evaluating Solr for use in a project. In the Solr FAQ under How can I rebuild my index from scratch if I change my schema?  After restarting the server, step  5 is to

Re: Evaluating Solr

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 12:46 PM, Joseph Werner telco...@gmail.com wrote: For more routine changes, are record updates supported without the necessity to rebuild an index? For example, if a description field for an item needs to be changed, am I correct in reading that the record need only be

JNDI in db-data-config.xml websphere

2012-04-04 Thread tech20nn
I am trying to use the jndiName attribute in db-data-config.xml. This works great in Tomcat. However, I am having issues in WebSphere. The following exception is thrown: Make sure that a J2EE application does not execute JNDI operations on java: names within static code blocks or in threads created by that

Re: Does any one know when Solr 4.0 will be released.

2012-04-04 Thread Darren Govoni
No one knows. But if you ask the devs, they will say 'when it's done'. One clue might be to monitor the bugs/issues scheduled for 4.0. When they are all resolved, then it's ready. On Wed, 2012-04-04 at 09:41 -0700, srinivas konchada wrote: Hello every one Does any one know when Solr 4.0 will be

Re: UTF-8 encoding

2012-04-04 Thread Erik Hatcher
Apologies for not replying sooner on this thread, I just noticed it today... To add insight into where velocity.properties can reside, it is used this way in VelocityResponseWriter.java: SolrVelocityResourceLoader resourceLoader = new

Re: solrcloud is deleteByQuery stored in transactions and forwarded like other operations?

2012-04-04 Thread Jamie Johnson
Thanks Mark. The delete by query is a very rare operation for us and I really don't have the liberty to update to current trunk right now. Do you happen to know about when the fix was made so I can see if we are before or after that time? On Wed, Apr 4, 2012 at 10:25 AM, Mark Miller

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread vybe3142
Thanks. Increasing max. heap space is not a scalable option as it reduces the ability of the system to scale with multiple concurrent index requests. The use case is indexing a set of text files which we have no control over i.e. could be small or large. -- View this message in context:

Re: solrcloud is deleteByQuery stored in transactions and forwarded like other operations?

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 3:04 PM, Jamie Johnson jej2...@gmail.com wrote: Thanks Mark.  The delete by query is a very rare operation for us and I really don't have the liberty to update to current trunk right now. Do you happen to know about when the fix was made so I can see if we are before or

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread vybe3142
Updating a single field is not possible in solr. The whole record has to be rewritten. Unfortunate. Lucene allows it. -- View this message in context: http://lucene.472066.n3.nabble.com/Incrementally-updating-a-VERY-LARGE-field-Is-this-possibe-tp3881945p3885253.html Sent from the Solr -

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 3:14 PM, vybe3142 vybe3...@gmail.com wrote: Updating a single field is not possible in solr.  The whole record has to be rewritten. Unfortunate. Lucene allows it. I think you're mistaken - the same limitations apply to Lucene. -Yonik lucenerevolution.com - Lucene/Solr

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread Walter Underwood
I believe we are talking about two different things. The original question was about incrementally building up a field during indexing, right? After a document is committed, a field cannot be separately updated, that is true in both Lucene and Solr. wunder On Apr 4, 2012, at 12:20 PM, Yonik

SOLRCloud on appserver

2012-04-04 Thread SOLRUSER
Does anyone have any instructions on setting up SOLRCloud on multiple appservers? Ideally a wiki, blog, step-by-step guide I can follow.

Re: space making it hard tu use wilcard with lucene parser

2012-04-04 Thread jmlucjav
thanks, that will work I think -- View this message in context: http://lucene.472066.n3.nabble.com/space-making-it-hard-tu-use-wilcard-with-lucene-parser-tp3882534p3885460.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr with UIMA

2012-04-04 Thread Tommaso Teofili
Hi again Chris, I finally managed to find some proper time to test your configuration. First thing to notice is that it worked for me assuming the following pre-requisites were satisfied: - you had the jar containing the AnalysisEngine for the RoomAnnotator.xml in your libraries section (this is

Re: Using UIMA in Solr behind a firewall

2012-04-04 Thread Tommaso Teofili
Hello Peter, I think that is more related to UIMA AlchemyAPIAnnotator [1] or to AlchemyAPI services themselves [2] because Solr just uses the out-of-the-box UIMA AnalysisEngine for that. Thus it may make sense to ask on d...@uima.apache.org (or even directly to the AlchemyAPI guys). HTH, Tommaso [1]

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread jmlucjav
Depending on your JVM version, -XX:+UseCompressedStrings would help alleviate the problem. It did help me before. xab -- View this message in context: http://lucene.472066.n3.nabble.com/Incrementally-updating-a-VERY-LARGE-field-Is-this-possibe-tp3881945p3885493.html Sent from the Solr - User

RE: Distributed grouping issue

2012-04-04 Thread Young, Cody
Hi Martijn, I created a JIRA issue and attached a test that fails. It seems to exhibit the same issue that I see on my local box. (If you run it multiple times you can see that the group value of the top doc changes between runs.) Also, I had to add fixShardCount = true; in the

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread vybe3142
Yonik Seeley-2-2 wrote On Wed, Apr 4, 2012 at 3:14 PM, vybe3142 <vybe3142@> wrote: Updating a single field is not possible in solr. The whole record has to be rewritten. Unfortunate. Lucene allows it. I think you're mistaken - the same limitations apply to Lucene. -Yonik

waitFlush and waitSearcher with SolrServer.add(docs, commitWithinMs)

2012-04-04 Thread Mike O'Leary
If you index a set of documents with SolrJ and use StreamingUpdateSolrServer.add(Collection<SolrInputDocument> docs, int commitWithinMs), it will perform a commit within the time specified, and it seems to use default values for waitFlush and waitSearcher. Is there a place where you can specify

Re: waitFlush and waitSearcher with SolrServer.add(docs, commitWithinMs)

2012-04-04 Thread Mark Miller
On Apr 4, 2012, at 6:50 PM, Mike O'Leary wrote: If you index a set of documents with SolrJ and use StreamingUpdateSolrServer.add(Collection<SolrInputDocument> docs, int commitWithinMs), it will perform a commit within the time specified, and it seems to use default values for waitFlush and

Re: LBHttpSolrServer to query a preferred server

2012-04-04 Thread Martin Grotzke
Hi, I just submitted an issue with patch for this: https://issues.apache.org/jira/browse/SOLR-3318 Cheers, Martin On 04/04/2012 03:53 PM, Martin Grotzke wrote: Hi, we want to use the LBHttpSolrServer (4.0/trunk) and specify a preferred server. Our use case is that for one user request we

RE: waitFlush and waitSearcher with SolrServer.add(docs, commitWithinMs)

2012-04-04 Thread Mike O'Leary
I am indexing some database contents using add(docs, commitWithinMs), and those add calls are taking over 80% of the time once the database begins returning results. I was wondering if setting waitSearcher to false would speed this up. Many of the calls take 1 to 6 seconds, with one outlier

Is there any performance cost of using lots of OR in the solr query

2012-04-04 Thread roz dev
Hi All, I am working on an application which makes a few Solr calls to get the data. At a high level, we have a requirement like this: - Make a first call to Solr to get the list of products which are children of a given category - Make a 2nd Solr call to get product documents based on a

Duplicates in Facets

2012-04-04 Thread Jamie Johnson
I am currently indexing some information and am wondering why I am getting duplicates in facets. From what I can tell they are the same, but is there any case that could cause this that I may not be thinking of? Could this be some non-printable character making its way into the index? Sample

Re: Duplicates in Facets

2012-04-04 Thread Darren Govoni
Try using Luke to look at your index and see if there are multiple similar TFV's. You can browse them easily in Luke. On Wed, 2012-04-04 at 23:35 -0400, Jamie Johnson wrote: I am currently indexing some information and am wondering why I am getting duplicates in facets. From what I can tell

Re: Duplicates in Facets

2012-04-04 Thread Jamie Johnson
Yes, thanks for the reply. Turns out there are whitespace differences in these fields, thank you for the quick reply! On Wed, Apr 4, 2012 at 11:45 PM, Darren Govoni dar...@ontrenet.com wrote: Try using Luke to look at your index and see if there are multiple similar TFV's. You can browse them
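Since the duplicates came down to whitespace differences, one possible way to avoid them is to normalize values on the client before indexing. The sketch below is an illustration only, not what the poster did: the class and method names are hypothetical, and treating the non-breaking space as whitespace is an assumption.

```java
// Hypothetical helper: cleans a facet value before it is sent to Solr,
// so values that differ only in whitespace end up in the same facet bucket.
class FacetValueNormalizer {
    static String normalize(String raw) {
        if (raw == null) {
            return null;
        }
        // Collapse each run of whitespace (including U+00A0, which the
        // regex \s class does not match by default) into a single space,
        // then strip the leading and trailing space.
        return raw.replaceAll("[\\s\\u00A0]+", " ").trim();
    }
}
```

A similar effect could probably be achieved on the Solr side in the field's analysis chain; doing it on the client simply keeps the stored value and the facet value identical.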

Re: solrcloud is deleteByQuery stored in transactions and forwarded like other operations?

2012-04-04 Thread Jamie Johnson
My snapshot was taken 2/27. That would seem to indicate that the deleteByQuery should be getting versioned, I am not sure if the other issues that were resolved would change the operation. I'll keep an eye on it and if it pops up I'll try to push the update. Thanks. On Wed, Apr 4, 2012 at 3:12

Re: SolrCloud replica and leader out of Sync somehow

2012-04-04 Thread Jamie Johnson
Not sure if this got lost in the shuffle, were there any thoughts on this? On Wed, Mar 21, 2012 at 11:02 AM, Jamie Johnson jej2...@gmail.com wrote: Given that in a distributed environment the docids are not guaranteed to be the same across shards should the sorting use the uniqueId field as

alt attribute img tag

2012-04-04 Thread Manuel Antonio Novoa Proenza
Hello, I would like to know how to extract the alt attribute data from the images that are in HTML documents.