Re: Deduplication patch not working in nightly build

2009-01-10 Thread Grant Ingersoll
I've seen similar errors when large background merges happen while  
looping in a result set.  See http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/




On Jan 9, 2009, at 12:50 PM, Mark Miller wrote:

You're basically writing segments more often now, and somehow avoiding
a longer merge, I think. Also, deduplication is probably
adding enough extra data to your index to hit a sweet spot where a
merge takes too long. Or something to that effect - MySQL is especially
sensitive to timeouts when doing a select * on a huge db in my
testing. I didn't understand your answer on the autocommit - I take
it you are using it? Or no?


This is all a guess, but it definitely points to a merge taking a bit
too long and causing a timeout. I think you can relax the MySQL timeout
settings if that is the cause.


I'd like to get to the bottom of this as well, so any other info you  
can provide would be great.


- Mark

Marc Sturlese wrote:

Hey Shalin,

In the beginning (when the error was appearing) I had
<ramBufferSizeMB>32</ramBufferSizeMB>

and no maxBufferedDocs set

Now I have:
<ramBufferSizeMB>32</ramBufferSizeMB>
<maxBufferedDocs>50</maxBufferedDocs>

I think that by setting maxBufferedDocs to 50 I am forcing more disk
writing than I would like... but at least it works fine (though a bit
slower, obviously).
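
To illustrate why a low maxBufferedDocs can mean more disk writes, here is a toy model, not Lucene's actual code. It assumes the writer flushes a segment as soon as either the buffered document count or the buffered size reaches its limit, whichever comes first. The function name, document sizes, and document counts are hypothetical.

```python
def count_flushes(doc_sizes_kb, ram_buffer_kb=None, max_buffered_docs=None):
    """Count segment flushes in a simplified model of a buffering index writer.

    A flush is triggered when the number of buffered documents reaches
    max_buffered_docs or the buffered size reaches ram_buffer_kb,
    whichever happens first. A limit of None means "not set".
    """
    flushes = 0
    buffered_docs = 0
    buffered_kb = 0
    for size_kb in doc_sizes_kb:
        buffered_docs += 1
        buffered_kb += size_kb
        hit_docs = max_buffered_docs is not None and buffered_docs >= max_buffered_docs
        hit_ram = ram_buffer_kb is not None and buffered_kb >= ram_buffer_kb
        if hit_docs or hit_ram:
            flushes += 1
            buffered_docs = 0
            buffered_kb = 0
    return flushes


# Hypothetical workload: 10,000 documents of about 10 KB each.
docs = [10] * 10000
ram_only = count_flushes(docs, ram_buffer_kb=32 * 1024)
with_doc_limit = count_flushes(docs, ram_buffer_kb=32 * 1024, max_buffered_docs=50)
print("flushes with ramBufferSizeMB=32 only:", ram_only)
print("flushes with maxBufferedDocs=50 as well:", with_doc_limit)
```

In this model the document limit produces many small segments instead of a few large ones, which would fit the observation that indexing is slower but each individual pause is shorter.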


I keep saying that the weirdest thing is that I don't have that
problem using Solr 1.3, just with the nightly...

Even though it's good that it works well now, it would be great if
someone could give me an explanation of why this is happening.


Shalin Shekhar Mangar wrote:


On Fri, Jan 9, 2009 at 9:23 PM, Marc Sturlese
marc.sturl...@gmail.com wrote:



Hey there,
I didn't have autoCommit set to true, but I have it sorted! The error
stopped appearing after setting the property maxBufferedDocs in
solrconfig.xml. I can't exactly understand why, but it just worked.
Anyway, maxBufferedDocs is deprecated; would ramBufferSizeMB do
the same?





What I find strange is this line in the exception:
Last packet sent to the server was 202481 ms ago.

Something took very very long to complete and the connection got
closed by the time the next row was fetched from the opened resultset.
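
As a rough sanity check on that number, the sketch below converts the idle time from the exception message into seconds and compares it with a timeout. The 60 second value is an assumption chosen for illustration only; the timeouts actually in effect depend on the MySQL server and JDBC driver configuration.

```python
def idle_exceeds_timeout(idle_ms, timeout_s):
    """Return True if an idle period (milliseconds) is longer than a timeout (seconds)."""
    return idle_ms / 1000.0 > timeout_s


IDLE_MS = 202481        # taken from the exception message quoted above
ASSUMED_TIMEOUT_S = 60  # assumption for illustration, not read from any real config

print(f"connection idle for {IDLE_MS / 1000.0:.1f} s")
print("longer than assumed timeout:", idle_exceeds_timeout(IDLE_MS, ASSUMED_TIMEOUT_S))
```

An idle period of more than three minutes would exceed any timeout in that range, so a pause of that length between row fetches is enough to explain a dropped connection.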

Just curious, what was the previous value of maxBufferedDocs and
what did you change it to?








--
Regards,
Shalin Shekhar Mangar.










--
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ












Re: Beginner: importing own data

2009-01-10 Thread phil cryer
Ah!  That got me going, thanks so much!  I've also created an all_Text
field in my schema where I can dump a bunch of other fields so they're
searchable.  Again, I appreciate all the above replies.

P

On Fri, Jan 9, 2009 at 10:48 AM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
 You were searching for 1899 which is the value of the date field in the
 document you added. You need to specify q=date:1899 to search on the date
 field.
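
 A minimal sketch of building such a fielded query URL follows. The host, port, and path are assumptions based on the Solr example server, and build_query_url is a hypothetical helper name; only the q=date:1899 parameter comes from this thread.

```python
from urllib.parse import urlencode

# Assumed location of the Solr select handler; adjust host, port, and path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_query_url(field, value, base_url=SOLR_SELECT_URL):
    """Return a select URL that searches for `value` in `field` only."""
    params = {"q": f"{field}:{value}"}
    # urlencode percent-encodes the colon, which Solr decodes back to date:1899.
    return f"{base_url}?{urlencode(params)}"


print(build_query_url("date", "1899"))
```

 Without the field prefix, the query is run against the default search field instead, which is why a bare 1899 found nothing.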

 You can also use the defaultSearchField element in schema.xml to specify
 the field on which you'd like to search if no field name is specified in the
 query. Typically, one creates a catch-all field which copies data from all
 the fields you want to search on.

 http://wiki.apache.org/solr/SchemaXml#head-b80c539a0a01eef8034c3776e49e8fe1c064f496

 Also look at the DisMax queries:

 http://wiki.apache.org/solr/DisMaxRequestHandler

 On Fri, Jan 9, 2009 at 8:35 PM, phil cryer p...@cryer.us wrote:

 Otis
 Thanks for your reply, I wrote out a long email explaining the steps I
 took, and the results, but it was returned by the Solr-user email
 server stamped as spam.  I've put my note on pastebin, you can see it
 here: http://pastebin.cryer.us/pastebin.php?show=m359e2e47

 I'd appreciate any feedback, I know I'm close to getting this working,
 just can't see what I'm missing.

 Thank you

 P

 On Thu, Jan 8, 2009 at 4:19 PM, Otis Gospodnetic
 otis_gospodne...@yahoo.com wrote:
  Phil,
 
  The easiest thing to do at this stage of the Solr learning experience is to
  restart Solr (the servlet container) and redo the search.  Results should start
  showing up then because this will effectively reopen the index.
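
  If the documents were posted but never committed, an explicit commit is another way to make them visible without a restart. The sketch below only builds the request; the update URL is an assumption based on the Solr example server, and build_commit_request and send_commit are hypothetical helper names.

```python
from urllib.request import Request, urlopen

# Assumed location of the Solr update handler; adjust host, port, and path.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_commit_request(update_url=SOLR_UPDATE_URL):
    """Build (but do not send) an HTTP POST carrying a <commit/> message."""
    return Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


def send_commit(update_url=SOLR_UPDATE_URL):
    """Send the commit and return the HTTP status code. Needs a running Solr."""
    with urlopen(build_commit_request(update_url), timeout=30) as response:
        return response.status


request = build_commit_request()
print(request.get_method(), request.full_url)
```

  Calling send_commit() against a running server performs the actual commit; it is left out of the example run so the sketch works without one.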
 
 
  Otis
  --
  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
  ----- Original Message -----
  From: phil cryer p...@cryer.us
  To: solr-user@lucene.apache.org
  Sent: Thursday, January 8, 2009 5:00:29 PM
  Subject: Beginner: importing own data
 
  So I have Solr running, I've run through the tutorials online, can
  import data from the example xml and see the results, so it works!
  Now, I take some xml data I have, convert it over to the add / doc
  type that the demo ones are, run it and find out which fields aren't
  defined in schema.xml, I add them there until they're all there and I
  can finally import my own xml into solr w/o error.  But, when I go to
  query solr, it's not there.  Again, I'm using the same procedure that
  I used on the example xml files, and they did the 'commit' at the end,
  so I'm doing something wrong.
 
  Is that all I need to do, define my fields in schema.xml and then
  import via post.jar?  It seems to work, but no results are ever found
  by solr.  I'm open to trying any debugging or whatever, I need to
  figure this out before I can start learning solr.
 
  Thanks
 
  P
 
 



 --
 Regards,
 Shalin Shekhar Mangar.



Re: UUID field type documentation and ExtractingRequestHandler

2009-01-10 Thread Chris Hostetter
: The UUID field type is not documented on the Wiki.

*many* things are not documented on the wiki ... the javadocs are the 
primary source of info about what fieldtypes and analysis factories and 
such are available.

the wiki docs are primarily a place for people to add tips & tricks or 
extra information about using various features beyond just the simple 
basics of its options/params.




-Hoss