Re: field collapse using 'adjacent' 'includeCollapsedDocs' + 'sort' query field

2009-11-15 Thread michael8

Hi Martijn,

Thanks for your insight into collapsedDocs, and for explaining what I need to
modify to get the functionality I want.

Michael


Martijn v Groningen wrote:
 
 Hi Michael,
 
 What you are saying seems logical, but that is currently not the case
 with the collapsedDocs functionality. This functionality was built
 with computing aggregated statistics in mind, and not really to provide a
 separate search result per collapse group. Although the collapsed
 documents are collected in the order they appear in the search result
 (only if the collapse type is adjacent), they are not saved in the order
 they appear.
 
 If you really need to have the collapse group search results in the
 order they were collapsed, you need to tweak the code. What you can do
 is change the CollapsedDocumentCollapseCollector class in the
 DocumentFieldsCollapseCollectorFactory.java source file. Currently the
 document ids are stored inside an OpenBitSet per collapse group. You
 can change that into an ArrayList<Integer>, for example. In this way
 the order in which the documents were collapsed is preserved.
 
 I think the downside of this change will be an increase in memory
 usage. OpenBitSet is more memory-efficient than an ArrayList of
 integers, but I think that this will only be a real problem when the
 collapse groups become very large.
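 
 A minimal sketch of the idea (hypothetical names -- the structure of the
 actual field-collapse patch differs):
 
     import java.util.*;
 
     class OrderPreservingCollapseGroups {
         // before: Map<String, OpenBitSet> docsPerGroup;  // insertion order lost
         private final Map<String, List<Integer>> docsPerGroup =
             new HashMap<String, List<Integer>>();
 
         void collect(String groupValue, int docId) {
             List<Integer> docs = docsPerGroup.get(groupValue);
             if (docs == null) {
                 docs = new ArrayList<Integer>();
                 docsPerGroup.put(groupValue, docs);
             }
             // an ArrayList preserves collapse order, at roughly 32 bits per
             // doc versus ~1 bit per doc in an OpenBitSet
             docs.add(docId);
         }
     }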
 
 I hope this will answer your question.
 
 Martijn
 
 2009/11/14 michael8 mich...@saracatech.com:

 Hi,

 This almost seems like a bug, but I can't be sure, so I'm seeking
 confirmation.  Basically I am building a site that presents search
 results
 in reverse chronological order.  I am also leveraging the field
 collapse
 feature so that I can group results using 'adjacent' mode and have Solr
 return the collapsed results as well via 'includeCollapsedDocs'.  My
 collapsing field is a custom grouping_id that I have specified.

 What I'm noticing is that my search results are coming back in the
 correct
 order by descending time (via the 'sort' param in the main query), as
 expected.
 However, the results returned within the 'collapsedDocs' section via
 'includeCollapsedDocs' are not in the same descending time order.

 My question is: shouldn't the collapsedDocs results also be in the same
 'sort' order and key I have specified in the overall query, particularly
 since 'adjacent' mode is enabled, which would mean the collapsed results
 are 'adjacent' in the sort order of the results?

 I'm using Solr 1.4.0 + field collapse patch as of 10/27/2009

 Thanks,
 Michael

 --
 View this message in context:
 http://old.nabble.com/field-collapse-using-%27adjacent%27---%27includeCollapsedDocs%27-%2B-%27sort%27-query-field-tp26351840p26351840.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 

-- 
View this message in context: 
http://old.nabble.com/field-collapse-%27includeCollapsedDocs%27-doesn%27t-return-results-within-%27collapsedDocs%27-in-%27sort%27-order-specified-tp26351840p26360433.html
Sent from the Solr - User mailing list archive at Nabble.com.



solr stops running periodically

2009-11-15 Thread athir nuaimi
We have 4 machines running solr.  On one of the machines, every 2-3
days solr stops running.  By that I mean that the java/tomcat
process just disappears.  If I look at the catalina logs, I see
normal log entries and then nothing.  There are no shutdown messages
like you would normally see if you sent a SIGTERM to the process.


Obviously this is a problem. I'm new to solr/java, so if there are
more diagnostic things I can do I'd appreciate any tips/advice.


thanks in advance
Athir




Re: Spell check suggestion and correct way of implementation and some Questions

2009-11-15 Thread Shalin Shekhar Mangar
On Wed, Nov 4, 2009 at 12:31 AM, darniz rnizamud...@edmunds.com wrote:


 Thanks

 I included the buildOnCommit and buildOnOptimize options as true and indexed
 some documents, and it automatically builds the dictionary.

 Are there any performance issues we should be aware of with this approach?


Well, it depends. Each commit/optimize will re-create the spell check index
with those options. So it is best if you test it out with your own index,
queries and load.
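
For reference, those options live in the spellchecker configuration in
solrconfig.xml. A minimal sketch (the source field name is illustrative):

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <lst name="spellchecker">
        <str name="name">default</str>
        <str name="field">spell</str>                  <!-- illustrative source field -->
        <str name="spellcheckIndexDir">./spellchecker</str>
        <str name="buildOnCommit">true</str>           <!-- rebuild on every commit -->
        <str name="buildOnOptimize">true</str>         <!-- rebuild on every optimize -->
      </lst>
    </searchComponent>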


-- 
Regards,
Shalin Shekhar Mangar.


Newbie tips: migrating from mysql fulltext search / PHP integration

2009-11-15 Thread mbneto
Hi,

I am looking for alternatives to MySQL fulltext searches.  The combo
Lucene/Solr is one of my options, and I'd like to gather as much information
as I can before choosing, and even build a prototype.

My current needs do not seem to be unusual:

- fast response time (currently some searches can take more than 11 sec)
- an API to add/update/delete documents in the collection
- a way to add synonyms or similar words for misspelled ones (ex. Sony =>
Soni)
- a way to define the relevance of results (ex. if I search for LCD, return
products that belong to the LCD category, contain LCD in the product
definition, or are marked as special offer)

I know that I may have to add external code, for example to take the
results and apply some business logic to re-sort them, but I'd like to
know, besides the wiki and the Solr 1.4 Enterprise Search Server book (which
I am considering buying), the tips for Solr usage.
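
(On the synonyms point above: Solr's SynonymFilterFactory reads entries from
a synonyms.txt file. A minimal sketch with an illustrative entry, mapping the
misspelling to the canonical term:)

    # synonyms.txt -- illustrative entry
    Soni => Sony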


[OT] Webinar on spatial search using Lucene and Solr

2009-11-15 Thread Grant Ingersoll
From Here to There, You Can Find it Anywhere:
 Building Local/Geo-Search
 with Apache Lucene and Solr
 
Join us for a free webinar hosted by TechTarget / TheServerSide.com
 Wednesday, November 18th 2009
 10:00 AM PST / 1:00 PM EST
 
Click here to sign up
 http://theserversidecom.bitpipe.com/detail/RES/1257457967_42.html&asrc=CL_PRM_Lucid_11_18_09_c&li=252934
 
With new advances in the flexibility and customizability of Apache Lucene/Solr 
open source search, the ubiquity of location-aware devices and vast amounts of 
spatial data, tremendous opportunities open up to deliver more powerful and 
effective geo-aware search results.
 
We'll hear from Grant Ingersoll, co-founder of Lucid Imagination and chairman 
of the Apache Lucene PMC, for an in-depth technical workshop on the potential 
and application of the newly released Lucene and Solr geo-search functions. 
Grant will be joined by thought leaders: Ryan McKinley, co-founder of Voyager 
GIS and Apache Lucene PMC member; and Sameer Maggon, of AT&T Interactive, which 
manages and delivers online and mobile advertising products across AT&T's media 
platforms.
 
 - Features and benefits of using spatial data in a search engine
 - Representing and leveraging spatial data in Lucene to empower Local Search
 - Spatial search in action: a peek at Voyager GIS, a tool to index and search
   geographic data
 - How AT&T Interactive uses Solr/Lucene to power local search at YP.com
 
Click here to sign up
 http://theserversidecom.bitpipe.com/detail/RES/1257457967_42.html&asrc=CL_PRM_Lucid_11_18_09_c&li=252934

 About the presenters:
 
Grant Ingersoll
 Co-founder of Lucid Imagination
 Grant Ingersoll, co-founder of Lucid Imagination, is a published expert in 
 search and Natural Language Processing, with many articles published on 
 Lucene, Solr, findability, and relevance, and is a co-founder of the Apache 
 Mahout machine learning project. Grant is the author of the forthcoming book 
 Taming Text, from Manning Publications.
 
Ryan McKinley
 Co-founder of Voyager GIS
 Ryan McKinley, co-founder of Voyager GIS, works with technology to help find, 
 share, and distribute information. He has built many sites using Solr, 
 including ludb.clui.org and www.digitalcommonwealth.org. He was a partner at 
 Squid Labs and co-founded www.instructables.com. Ryan is a member of Lucid 
 Imagination's Technical Advisory Board.
 
Sameer Maggon
 AT&T Interactive
 Sameer Maggon leads the Search Engineering Team at AT&T Interactive. He 
 helped the company launch YP.com, which uses Solr underneath. Before joining 
 AT&T Interactive, he worked with Siderean (http://www.siderean.com) on 
 an enterprise search and navigation product that used Lucene and was 
 ultimately responsible for delivering the technology behind their new 
 product. Sameer has been an active Lucene user since 2001.


Re: solr stops running periodically

2009-11-15 Thread Grant Ingersoll
Have you looked in other logs, like your syslogs?  I've never seen Solr/Tomcat 
just disappear w/o so much as a blip.  I'd think if a process just died from an 
error condition there would be some note of it somewhere.  I'd try to find some 
other events taking place at that time which might give a hint.

On Nov 15, 2009, at 1:45 PM, athir nuaimi wrote:

 We have 4 machines running solr.  On one of the machines, every 2-3 days 
 solr stops running.  By that I mean that the java/tomcat process just 
 disappears.  If I look at the catalina logs, I see normal log entries and 
 then nothing.  There are no shutdown messages like you would normally see if 
 you sent a SIGTERM to the process.
 
 Obviously this is a problem. I'm new to solr/java so if there are more 
 diagnostic things I can do I'd appreciate any tips/advice.
 
 thanks in advance
 Athir
 




RE: Segment file not found error - after replicating

2009-11-15 Thread Maduranga Kannangara
Yes. We have tried Solr 1.4 and so far it's been a great success.

Still, I am investigating why Solr 1.3 gave an issue like the one before.

Currently it seems to me that
org.apache.lucene.index.SegmentInfos.FindSegmentFile.run() is not able to
figure out the correct segment file name. (Maybe an index replication issue --
leading to the index not being fully replicated.. but it's so hard to believe,
as both master and slave have 100% the same data now!)

Anyway.. I will keep on trying till I find something useful.. and will let you
know.


Thanks
Madu


-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, 11 November 2009 10:03 AM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

It sounds like your index is not being fully replicated.  I can't tell why, but 
I can suggest you try the new Solr 1.4 replication.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: Maduranga Kannangara mkannang...@infomedia.com.au
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Tue, November 10, 2009 5:42:44 PM
 Subject: RE: Segment file not found error - after replicating

 Thanks Otis,

 I did the du -s for all three index directories as you said, right after
 replicating and when I found errors.

 All three gave me the exact same value. This time I found the error in a
 rather small index too (31 MB).

 BTW, if I copy the segments_x file to what Solr is looking for and restart the
 Solr web-app from the Tomcat manager, this resolves it. But that's just a
 workaround, never good enough for production deployments.

 My next plan is to do a remote debug to see what exactly is happening in the
 code.

 Any other things I should be looking at?
 Any help is really appreciated on this matter.

 Thanks
 Madu


 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Tuesday, 10 November 2009 1:14 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Segment file not found error - after replicating

 Madu,

 So are you saying that all slaves have the exact same index, and that index is
 exactly the same as the one on the master, yet only some of those slaves 
 exhibit
 this error, while others do not?  Mind listing index directories of 1) master 
 2)
 slave without errors, 3) slave with errors and doing:
 du -s /path/to/index/on/master
 du -s /path/to/index/on/slave/without/errors
 du -s /path/to/index/on/slave/with/errors


 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 
  From: Maduranga Kannangara
  To: solr-user@lucene.apache.org
  Sent: Mon, November 9, 2009 7:47:04 PM
  Subject: RE: Segment file not found error - after replicating
 
  Thanks Otis!
 
  Yes, I checked the index directories and they are 100% the same, both
  timestamp- and size-wise.

  Not all the slaves face this issue. I would say roughly 50% have this
  trouble.

  Logs do not have any errors either :-(

  Any other things I should do/look at?
 
  Cheers
  Madu
 
 
  -Original Message-
  From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
  Sent: Tuesday, 10 November 2009 9:26 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Segment file not found error - after replicating
 
  It's hard to troubleshoot blindly like this, but have you tried manually
  comparing the contents of the index dir on the master and on the slave(s)?
  If they are out of sync, have you tried forcing of replication to see if one
 of
  the subsequent replication attempts gets the dirs in sync?
  Do you have more than 1 slave and do they all start having this problem at 
  the
  same time?
  Any errors in the logs for any of the scripts involved in replication in 
  1.3?
 
  Otis
  --
  Sematext is hiring -- http://sematext.com/about/jobs.html?mls
  Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
 
 
 
  - Original Message 
   From: Maduranga Kannangara
   To: solr-user@lucene.apache.org
   Sent: Sun, November 8, 2009 10:30:44 PM
   Subject: Segment file not found error - after replicating
  
   Hi guys,
  
   We use Solr 1.3 for indexing large amounts of data (50G avg) in a Linux
   environment and use the replication scripts to make replicas that live in
   load-balancing slaves.

   The issue we face quite often (only on Linux servers) is that they tend to
   not be able to find the segments file (segments_x etc.) after the
   replication has completed. As this has become quite common, we have started
   hitting a serious issue.

   Below is a stack trace, if that helps, and any help on this matter is
   greatly appreciated.
  
   
  
   Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader
 load
   INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
 

RE: Segment file not found error - after replicating

2009-11-15 Thread Maduranga Kannangara
Just found out the root cause:

* The segments.gen file does not always get replicated to the slave.

For some reason, this small (20-byte) file lives in memory and does not get
written out to the master's hard disk, and therefore it is obviously not
transferred to the slaves.

The solution was to shut down the master web app (it must be a clean
shutdown, not a kill of Tomcat), then do the replication.

Also, if the timestamp/size (the size won't change anyway!) is not changed,
rsync does not seem to copy this file over either. So forcing the copy in the
replication scripts solved the problem.
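
(A hedged illustration, not the actual script change from this thread: rsync
skips files whose size and modification time are unchanged, so the copy of
segments.gen has to be forced, for example by comparing checksums.)

    # force content comparison so an unchanged-size segments.gen still copies
    rsync -av --checksum master:/path/to/index/ /path/to/index/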

Thanks Otis and everyone for all your support!

Madu


-Original Message-
From: Maduranga Kannangara
Sent: Monday, 16 November 2009 12:37 PM
To: solr-user@lucene.apache.org
Subject: RE: Segment file not found error - after replicating

Yes. We have tried Solr 1.4 and so far its been great success.

Still I am investigating why Solr 1.3 gave an issue like before.

Currently seems to me 
org.apache.lucene.index.SegmentInfos.FindSegmentFile.run() is not able to 
figure out correct segment file name. (May be index replication issue -- 
leading to not fully replicated.. but its so hard to believe as both master 
and slave are having 100% same data now!)

Anyway.. will keep on trying till I find something useful.. and will let you 
know.


Thanks
Madu


-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, 11 November 2009 10:03 AM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

It sounds like your index is not being fully replicated.  I can't tell why, but 
I can suggest you try the new Solr 1.4 replication.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: Maduranga Kannangara mkannang...@infomedia.com.au
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Tue, November 10, 2009 5:42:44 PM
 Subject: RE: Segment file not found error - after replicating

 Thanks Otis,

 I did the du -s for all three index directories as you said right after
 replicating and when I find errors.

 All three gave me the exact same value. This time I found the error in a 
 rather
 small index too (31Mb).

 BTW, if I copy the segment_x file to what Solr is looking for, and restart the
 Solr web-app from Tomcat manager, this resolves. But it's just a work around,
 never good enough for the production deployments.

 My next plan is to do a remote debug to see what exactly happening in the 
 code.

 Any other things I should looking at?
 Any help is really appreciated on this matter.

 Thanks
 Madu


 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Tuesday, 10 November 2009 1:14 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Segment file not found error - after replicating

 Madu,

 So are you saying that all slaves have the exact same index, and that index is
 exactly the same as the one on the master, yet only some of those slaves 
 exhibit
 this error, while others do not?  Mind listing index directories of 1) master 
 2)
 slave without errors, 3) slave with errors and doing:
 du -s /path/to/index/on/master
 du -s /path/to/index/on/slave/without/errors
 du -s /path/to/index/on/slave/with/errors


 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 
  From: Maduranga Kannangara
  To: solr-user@lucene.apache.org
  Sent: Mon, November 9, 2009 7:47:04 PM
  Subject: RE: Segment file not found error - after replicating
 
  Thanks Otis!
 
  Yes, I checked the index directories and they are 100% same, both timestamp
 and
  size wise.
 
  Not all the slaves face this issue. I would say roughly 50% has this 
  trouble.
 
  Logs do not have any errors too :-(
 
  Any other things I should do/look at?
 
  Cheers
  Madu
 
 
  -Original Message-
  From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
  Sent: Tuesday, 10 November 2009 9:26 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Segment file not found error - after replicating
 
  It's hard to troubleshoot blindly like this, but have you tried manually
  comparing the contents of the index dir on the master and on the slave(s)?
  If they are out of sync, have you tried forcing of replication to see if one
 of
  the subsequent replication attempts gets the dirs in sync?
  Do you have more than 1 slave and do they all start having this problem at 
  the
  same time?
  Any errors in the logs for any of the scripts involved in replication in 
  1.3?
 
  Otis
  --
  Sematext is hiring -- http://sematext.com/about/jobs.html?mls
  Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
 
 
 
  - Original Message 
   From: Maduranga Kannangara
   To: solr-user@lucene.apache.org
   Sent: Sun, November 8, 

Re: Segment file not found error - after replicating

2009-11-15 Thread Mark Miller
That's odd - that file is normally not used - it's a backup method to
figure out the current generation in case it cannot be determined with a
directory listing - it's basically for NFS.

Maduranga Kannangara wrote:
 Just found out the root cause:

 * The segments.gen file does not get replicated to slave all the time.

 For some reason, this small (20bytes) file lives in memory and does not get 
 updated to the master's hard disk. Therefore it is not obviously transferred 
 to slaves.

 Solution was to shut down the master web app (must be a clean shut down!, not 
 kill of Tomcat). Then do the replication.

 Also, if the timestamp/size (size won't change anyway!) is not changed, Rsync 
 does not seem to copy over this file too. So enforcing in the replication 
 scripts solved the problem.

 Thanks Otis and everyone for all your support!

 Madu


 -Original Message-
 From: Maduranga Kannangara
 Sent: Monday, 16 November 2009 12:37 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Segment file not found error - after replicating

 Yes. We have tried Solr 1.4 and so far its been great success.

 Still I am investigating why Solr 1.3 gave an issue like before.

 Currently seems to me 
 org.apache.lucene.index.SegmentInfos.FindSegmentFile.run() is not able to 
 figure out correct segment file name. (May be index replication issue -- 
 leading to not fully replicated.. but its so hard to believe as both master 
 and slave are having 100% same data now!)

 Anyway.. will keep on trying till I find something useful.. and will let you 
 know.


 Thanks
 Madu


 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Wednesday, 11 November 2009 10:03 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Segment file not found error - after replicating

 It sounds like your index is not being fully replicated.  I can't tell why, 
 but I can suggest you try the new Solr 1.4 replication.

 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 
   
 From: Maduranga Kannangara mkannang...@infomedia.com.au
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Tue, November 10, 2009 5:42:44 PM
 Subject: RE: Segment file not found error - after replicating

 Thanks Otis,

 I did the du -s for all three index directories as you said right after
 replicating and when I find errors.

 All three gave me the exact same value. This time I found the error in a 
 rather
 small index too (31Mb).

 BTW, if I copy the segment_x file to what Solr is looking for, and restart 
 the
 Solr web-app from Tomcat manager, this resolves. But it's just a work around,
 never good enough for the production deployments.

 My next plan is to do a remote debug to see what exactly happening in the 
 code.

 Any other things I should looking at?
 Any help is really appreciated on this matter.

 Thanks
 Madu


 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Tuesday, 10 November 2009 1:14 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Segment file not found error - after replicating

 Madu,

 So are you saying that all slaves have the exact same index, and that index 
 is
 exactly the same as the one on the master, yet only some of those slaves 
 exhibit
 this error, while others do not?  Mind listing index directories of 1) 
 master 2)
 slave without errors, 3) slave with errors and doing:
 du -s /path/to/index/on/master
 du -s /path/to/index/on/slave/without/errors
 du -s /path/to/index/on/slave/with/errors


 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 
 
 From: Maduranga Kannangara
 To: solr-user@lucene.apache.org
 Sent: Mon, November 9, 2009 7:47:04 PM
 Subject: RE: Segment file not found error - after replicating

 Thanks Otis!

 Yes, I checked the index directories and they are 100% same, both timestamp
   
 and
 
 size wise.

 Not all the slaves face this issue. I would say roughly 50% has this 
 trouble.

 Logs do not have any errors too :-(

 Any other things I should do/look at?

 Cheers
 Madu


 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Tuesday, 10 November 2009 9:26 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Segment file not found error - after replicating

 It's hard to troubleshoot blindly like this, but have you tried manually
 comparing the contents of the index dir on the master and on the slave(s)?
 If they are out of sync, have you tried forcing of replication to see if one
   
 of
 
 the subsequent replication attempts gets the dirs in sync?
 Do you have more than 1 slave and do they all start having this problem at 
 the
 same time?
 Any errors in the logs for any of the scripts involved in replication in 
 1.3?

 Otis
 --
 

RE: Segment file not found error - after replicating

2009-11-15 Thread Maduranga Kannangara
Yes, I too believed so..

The logic in the aforementioned method calculates the generation number both
from the segment files available (genA) and from the segments.gen file content
(genB). Whichever is larger is the generation used to look up the segments
file.

When the file is not properly replicated (because it is not written to the
hard disk, or not rsynced) and the generation number in the segments.gen file
(genB) is larger than the file-based calculation (genA), we hit the aforesaid
issue.

Cheers
Madu
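
(A minimal sketch of the two-source generation check described above -- not
the actual Lucene code. Lucene encodes the generation in base 36 in the
segments_N file name.)

    public class GenCheckSketch {
        // genA: highest generation among segments_N files in a directory listing
        static long genFromListing(String[] fileNames) {
            long max = -1;
            for (String name : fileNames) {
                if (name.startsWith("segments_")) {
                    max = Math.max(max,
                        Long.parseLong(name.substring("segments_".length()), 36));
                }
            }
            return max;
        }

        // whichever generation is larger wins: a stale listing on the slave plus
        // a newer, unreplicated segments.gen (genB) makes the lookup target a
        // segments_N file that does not exist locally -- the error in this thread
        static long pickGeneration(long genA, long genB) {
            return Math.max(genA, genB);
        }
    }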


-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Monday, 16 November 2009 2:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

Thats odd - that file is normally not used - its a backup method to
figure out the current generation in case it cannot be determined with a
directory listing - its basically for NFS.

Maduranga Kannangara wrote:
 Just found out the root cause:

 * The segments.gen file does not get replicated to slave all the time.

 For some reason, this small (20bytes) file lives in memory and does not get 
 updated to the master's hard disk. Therefore it is not obviously transferred 
 to slaves.

 Solution was to shut down the master web app (must be a clean shut down!, not 
 kill of Tomcat). Then do the replication.

 Also, if the timestamp/size (size won't change anyway!) is not changed, Rsync 
 does not seem to copy over this file too. So enforcing in the replication 
 scripts solved the problem.

 Thanks Otis and everyone for all your support!

 Madu


 -Original Message-
 From: Maduranga Kannangara
 Sent: Monday, 16 November 2009 12:37 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Segment file not found error - after replicating

 Yes. We have tried Solr 1.4 and so far its been great success.

 Still I am investigating why Solr 1.3 gave an issue like before.

 Currently seems to me 
 org.apache.lucene.index.SegmentInfos.FindSegmentFile.run() is not able to 
 figure out correct segment file name. (May be index replication issue -- 
 leading to not fully replicated.. but its so hard to believe as both master 
 and slave are having 100% same data now!)

 Anyway.. will keep on trying till I find something useful.. and will let you 
 know.


 Thanks
 Madu


 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Wednesday, 11 November 2009 10:03 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Segment file not found error - after replicating

 It sounds like your index is not being fully replicated.  I can't tell why, 
 but I can suggest you try the new Solr 1.4 replication.

 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 

 From: Maduranga Kannangara mkannang...@infomedia.com.au
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Tue, November 10, 2009 5:42:44 PM
 Subject: RE: Segment file not found error - after replicating

 Thanks Otis,

 I did the du -s for all three index directories as you said right after
 replicating and when I find errors.

 All three gave me the exact same value. This time I found the error in a 
 rather
 small index too (31Mb).

 BTW, if I copy the segment_x file to what Solr is looking for, and restart 
 the
 Solr web-app from Tomcat manager, this resolves. But it's just a work around,
 never good enough for the production deployments.

 My next plan is to do a remote debug to see what exactly happening in the 
 code.

 Any other things I should looking at?
 Any help is really appreciated on this matter.

 Thanks
 Madu


 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Tuesday, 10 November 2009 1:14 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Segment file not found error - after replicating

 Madu,

 So are you saying that all slaves have the exact same index, and that index 
 is
 exactly the same as the one on the master, yet only some of those slaves 
 exhibit
 this error, while others do not?  Mind listing index directories of 1) 
 master 2)
 slave without errors, 3) slave with errors and doing:
 du -s /path/to/index/on/master
 du -s /path/to/index/on/slave/without/errors
 du -s /path/to/index/on/slave/with/errors


 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 

 From: Maduranga Kannangara
 To: solr-user@lucene.apache.org
 Sent: Mon, November 9, 2009 7:47:04 PM
 Subject: RE: Segment file not found error - after replicating

 Thanks Otis!

 Yes, I checked the index directories and they are 100% same, both timestamp

 and

 size wise.

 Not all the slaves face this issue. I would say roughly 50% has this 
 trouble.

 Logs do not have any errors too :-(

 Any other things I should do/look at?

 Cheers
 Madu


 -Original Message-
 

Re: Newbie Solr questions

2009-11-15 Thread Peter Wolanin
Take a look at the example schema - you can have dynamic fields that
are used, based on wildcard matching against the field name, if a field
doesn't match the name of an explicitly declared field.
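
(For illustration, a dynamicField rule of the kind the example schema.xml
ships with:)

    <dynamicField name="*_s" type="string" indexed="true" stored="true"/>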

-Peter

On Sun, Nov 15, 2009 at 10:50 AM, yz5od2 woods5242-outdo...@yahoo.com wrote:
 Thanks for the reply:

 I follow the schema.xml concept, but what if my requirement is more dynamic
 in nature? I.e., I would like my developers to be able to annotate a POJO and
 submit it to the Solr server (embedded) to be indexed according to public
 properties OR annotations. Is that possible?

 If that is not possible, can I programmatically define documents and fields
 (and the field options) in straight Java? I.e., in the pseudo code below...

 // this is made up, but this is what I would like to be able to do
 SolrDoc document = new SolrDoc();
 SolrField field = new SolrField();
 field.isIndexed = true;
 field.isStored = true;
 field.name = "myField";
 field.value = myPOJO.getValue();
 document.addField(field);  // hypothetical attach step, missing in the original
 solrServer.index(document);
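
 (For comparison, a sketch using the real SolrJ API in Solr 1.4. Per-field
 options such as indexed/stored come from schema.xml -- e.g. a dynamicField
 rule -- not from the client; the name myField_s assumes a *_s dynamic field
 exists in the schema, and solrServer is an existing SolrServer instance.)

     import org.apache.solr.common.SolrInputDocument;

     SolrInputDocument doc = new SolrInputDocument();
     doc.addField("id", "doc-1");                  // matches the schema's id field
     doc.addField("myField_s", myPOJO.getValue()); // options come from the *_s rule
     solrServer.add(doc);
     solrServer.commit();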





 On Nov 15, 2009, at 12:50 AM, Avlesh Singh wrote:


 a) Since Solr is built on top of lucene, using SolrJ, can I still
 directly
 create custom documents, specify the field specifics etc (indexed, stored
 etc) and then map POJOs to those documents, similar to just using the
 straight lucene API?

 b) I took a quick look at the SolrJ javadocs but did not see anything in
 there that allowed me to customize if a field is stored, indexed, not
 indexed etc. How do I do that with SolrJ without having to go directly to
 the lucene apis?

 c) The SolrJ beans package. By annotating a POJO with @Field, how exactly
 does SolrJ treat that field? Indexed/stored, or just indexed? Is there
 any
 other way to control this?

 The answer to all your questions above is the magical file called
 schema.xml. For more read here - http://wiki.apache.org/solr/SchemaXml.
 SolrJ is simply a java client to access (read and update from) the solr
 server.
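
 (On the @Field question specifically, a short sketch with the real SolrJ bean
 API; the class and field names are illustrative.)

     import org.apache.solr.client.solrj.beans.Field;

     public class Item {
         @Field String id;             // maps to the schema field named "id"
         @Field("name_s") String name; // explicit mapping, e.g. to a *_s dynamic field
     }

     // whether a field is indexed/stored still comes from schema.xml:
     // solrServer.addBean(new Item()); solrServer.commit();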

 c) If I create a custom index outside of Solr using straight lucene, is it

 easy to import a pre-exisiting lucene index into a Solr Server?

 As long as the Lucene index matches the definitions in your schema you can
 use the same index. The data, however, needs to be copied into a predictable
 location inside SOLR_HOME.

 Cheers
 Avlesh

 On Sun, Nov 15, 2009 at 9:26 AM, yz5od2
 woods5242-outdo...@yahoo.comwrote:

 Hi,
 I am new to Solr but fairly advanced with lucene.

 In the past I have created custom Lucene search engines that indexed
 objects in a Java application, so my background is coming from this
 requirement

 a) Since Solr is built on top of lucene, using SolrJ, can I still
 directly
 create custom documents, specify the field specifics etc (indexed, stored
 etc) and then map POJOs to those documents, similar to just using the
 straight lucene API?

 b) I took a quick look at the SolrJ javadocs but did not see anything in
 there that allowed me to customize if a field is stored, indexed, not
 indexed etc. How do I do that with SolrJ without having to go directly to
 the lucene apis?

 c) The SolrJ beans package. By annotating a POJO with @Field, how exactly
 does SolrJ treat that field? Indexed/stored, or just indexed? Is there
 any
 other way to control this?

 c) If I create a custom index outside of Solr using straight lucene, is
 it
 easy to import a pre-exisiting lucene index into a Solr Server?

 thanks!






-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com


Re: Newbie Solr questions

2009-11-15 Thread yz5od2
ok, so what I am hearing is that there is no way to create custom documents/
fields via the SolrJ client at runtime. Instead you have to use the
schema.xml ahead of time OR create a custom index via the lucene APIs
and then import the indexes into Solr for searching?




On Nov 15, 2009, at 9:16 PM, Peter Wolanin wrote:


Take a look at the example schema - you can have dynamic fields that
are used based on wildcard matching to the field name if a field
doesn't match the name of an existing field.

-Peter

On Sun, Nov 15, 2009 at 10:50 AM, yz5od2 woods5242- 
outdo...@yahoo.com wrote:

Thanks for the reply:

I follow the schema.xml concept, but what if my requirement is more  
dynamic
in nature? I.E. I would like my developers to be able to annotate a  
POJO and
submit it to the Solr server (embedded) to be indexed according to  
public

properties OR annotations. Is that possible?

If that is not possible, can I programatically define documents and  
fields
(and the field options) in straight Java? I.E. in pseudo code  
below...


// this is made up but this is what I would like to be able to do
SolrDoc document = new SolrDoc();
SolrField field = new SolrField()
field.isIndexed=true;
field.isStored=true;
field.name = 'myField'

field.value = myPOJO.getValue();

solrServer.index(document);





On Nov 15, 2009, at 12:50 AM, Avlesh Singh wrote:



a) Since Solr is built on top of lucene, using SolrJ, can I still
directly
create custom documents, specify the field specifics etc  
(indexed, stored
etc) and then map POJOs to those documents, simular to just using  
the

straight lucene API?

b) I took a quick look at the SolrJ javadocs but did not see  
anything in
there that allowed me to customize if a field is stored, indexed,  
not
indexed etc. How do I do that with SolrJ without having to go  
directly to

the lucene apis?

c) The SolrJ beans package. By annotating a POJO with @Field, how  
exactly
does SolrJ treat that field? Indexed/stored, or just indexed? Is  
there

any
other way to control this?


The answer to all your questions above is the magical file called
schema.xml. For more read here - http://wiki.apache.org/solr/SchemaXml 
.
SolrJ is simply a java client to access (read and update from) the  
solr

server.

c) If I create a custom index outside of Solr using straight  
lucene, is it


easy to import a pre-exisiting lucene index into a Solr Server?

As long as the Lucene index matches the definitions in your schema  
you can
use the same index. The data however needs to copied into a  
predictable

location inside SOLR_HOME.

Cheers
Avlesh

On Sun, Nov 15, 2009 at 9:26 AM, yz5od2
woods5242-outdo...@yahoo.comwrote:


Hi,
I am new to Solr but fairly advanced with lucene.

In the past I have created custom Lucene search engines that  
indexed

objects in a Java application, so my background is coming from this
requirement

a) Since Solr is built on top of lucene, using SolrJ, can I still
directly
create custom documents, specify the field specifics etc  
(indexed, stored
etc) and then map POJOs to those documents, simular to just using  
the

straight lucene API?

b) I took a quick look at the SolrJ javadocs but did not see  
anything in
there that allowed me to customize if a field is stored, indexed,  
not
indexed etc. How do I do that with SolrJ without having to go  
directly to

the lucene apis?

c) The SolrJ beans package. By annotating a POJO with @Field, how  
exactly
does SolrJ treat that field? Indexed/stored, or just indexed? Is  
there

any
other way to control this?

c) If I create a custom index outside of Solr using straight  
lucene, is

it
easy to import a pre-exisiting lucene index into a Solr Server?

thanks!








--
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com





Re: solr stops running periodically

2009-11-15 Thread Otis Gospodnetic
Look for the HotSpot dump files that Sun's Java leaves on disk when it dies.  I 
think their names start with hs.  Luckily, I don't have any of them handy to 
tell you the exact name pattern.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: Grant Ingersoll gsing...@apache.org
 To: solr-user@lucene.apache.org
 Sent: Sun, November 15, 2009 8:15:47 PM
 Subject: Re: solr stops running periodically
 
 Have you looked in other logs, like your syslogs?  I've never seen 
 Solr/Tomcat 
 just disappear w/o so much as a blip.  I'd think if a process just died from 
 an 
 error condition there would be some note of it somewhere.  I'd try to find 
 some 
 other events taking place at that time which might give a hint.
 
 On Nov 15, 2009, at 1:45 PM, athir nuaimi wrote:
 
  We have 4 machines running solr.  On one of the machines, every 2-3 days 
  solr 
 stops running.  By that I mean that the java/tomcat process just disappears.  
 If 
 I look at the catalina logs, I see normal log entries and then nothing.  
 There 
 is no shutdown messages like you would normally see if you sent a SIGTERM to 
 the 
 process.
  
  Obviously this is a problem. I'm new to solr/java so if there are more 
 diagnostic things I can do I'd appreciate any tips/advice.
  
  thanks in advance
  Athir
  



Re: Newbie tips: migrating from mysql fulltext search / PHP integration

2009-11-15 Thread Otis Gospodnetic
Hi,

I'm not sure if you have a specific question there.
But regarding PHP integration part, I just learned PHP now has native Solr 
(1.3 and 1.4) support:

  http://twitter.com/otisg/status/5757184282


Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: mbneto mbn...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Sun, November 15, 2009 4:56:15 PM
 Subject: Newbie tips: migrating from mysql fulltext search / PHP integration
 
 Hi,
 
 I am looking for alternatives to MySQL fulltext searches.  The combo
 Lucene/Solr is one of my options and I'd like to gather as much information
 I can before choosing and even build a prototype.
 
 My current need does not seem to be different.
 
 - fast response time (currently some searches can take more than 11sec)
 - API to add/update/delete documents to the collection
 - way to add synonyms or similar words for misspelled ones (ex. Sony =>
 Soni)
 - way to define relevance of results (ex. If I search for LCD return
 products that belong to the LCD category, contains LCD in the product
 definition or ara marked as special offer)
 
 I know that I may have to add external code, for example, to take the
 results and apply some business logic to resort the results but I'd like to
 know, besides the wiki and the solr 1.4 Enterprise Seacrh Server book (which
 I am considering to buy) the tips for solr usage.



Re: Is there a way to skip cache for a query

2009-11-15 Thread Otis Gospodnetic
I don't think that is supported today.  It might be useful, though (e.g. 
something I'd use with an external monitoring service, so that it doesn't 
always get fast results from the cache).


Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: Bertie Shen bertie.s...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Sat, November 14, 2009 9:43:25 PM
 Subject: Is there a way to skip cache for a query
 
 Hey,
 
   I do not want to disable the cache completely by changing the setting in
 solrconfig.xml. I just want to sometimes skip the cache for a query, for
 testing purposes. So is there a parameter like skipcache=true to specify in
 select/?q=hot&version=2.2&start=0&rows=10&skipcache=true to skip the cache for
 the query [hot]? skipcache could by default be false.
 
 Thanks.



Re: converting over from sphinx

2009-11-15 Thread Otis Gospodnetic
Something doesn't sound right here.  Why do you need wildcards for queries in 
the first place?
Are you finding that with stopword removal and stemming you are not matching 
some docs that you think should be matched?  If so, we may be able to help if 
you provide a few examples.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: Cory Ondrejka cory.ondre...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Sat, November 14, 2009 12:57:56 PM
 Subject: converting over from sphinx
 
 I've been using Sphinx for full text search, but since I want to move my
 project over to Heroku, need to switch to Solr. Everything's up and running
 using the acts_as_solr plugin, but I'm curious if I'm using Solr the right
 way.  In particular, I'm doing phrase searching into a corpus of
 descriptions, such as I need help with a foo where I have a bunch of foo:
 a foo is a subset of a bar often used to create briznatzes, etc.
 
 With Sphinx, I could convert I need help with a foo into *need* *help*
 *with* *foo* and get pretty nice matches. With Solr, my understanding is
 that you can only do wildcard matches on the suffix. In addition, stemming
 only happens on non-wildcard terms. So, my first thought would be to convert
 I need help with a foo into need need* help help* with with* foo foo*.
 
 Thanks in advance for any help.
 
 -- 
 Cory Ondrejka
 cory.ondre...@gmail.com
 http://ondrejka.net/



RE: Is there a way to skip cache for a query

2009-11-15 Thread Jake Brownell
See https://issues.apache.org/jira/browse/SOLR-1363 -- it's currently scheduled 
for 1.5.

Jake

-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
Sent: Sunday, November 15, 2009 11:17 PM
To: solr-user@lucene.apache.org
Subject: Re: Is there a way to skip cache for a query

I don't think that is supported today.  It might be useful, though (e.g. 
something I'd use with an external monitoring service, so that it doesn't 
always get fast results from the cache).


Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: Bertie Shen bertie.s...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Sat, November 14, 2009 9:43:25 PM
 Subject: Is there a way to skip cache for a query
 
 Hey,
 
   I do not want to disable cache completely by changing the setting in
 solrconfig.xml. I just want to sometimes skip cache for a query for  testing
 purpose. So is there a parameter like skipcache=true to specify in
 select/?q=hot&version=2.2&start=0&rows=10&skipcache=true to skip cache for
 the query [hot]. skipcache can by default be false.
 
 Thanks.



Re: Some guide about setting up local/geo search at solr

2009-11-15 Thread Otis Gospodnetic
Nota bene:
My understanding is the external versions of Local Lucene/Solr are eventually 
going to be deprecated in favour of what we have in contrib.  Here's a stub 
page with a link to the spatial JIRA issue: 
http://wiki.apache.org/solr/SpatialSearch

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: Bertie Shen bertie.s...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Sat, November 14, 2009 3:32:01 AM
 Subject: Some guide about setting up local/geo search at solr
 
 Hey,
 
 I spent some time figuring out how to set up local/geo/spatial search in
 solr. I hope the following description can help, given the current status.
 
 1) Download localsolr. I download it from
 http://developer.k-int.com/m2snapshots/localsolr/localsolr/1.5/ and put jar
 file (in my case, localsolr-1.5.jar) in your application's WEB_INF/lib
 directory of application server.
 
 2) Download locallucene. I download it from
 http://sourceforge.net/projects/locallucene/ and put the jar file (in my case,
 locallucene.jar in the locallucene_r2.0/dist/ directory) in your application's
 WEB_INF/lib directory of application server. I also need to copy
 gt2-referencing-2.3.1.jar, geoapi-nogenerics-2.1-M2.jar, and jsr108-0.01.jar
 under locallucene_r2.0/lib/ directory to WEB_INF/lib. Do not copy
 lucene-spatial-2.9.1.jar under Lucene codebase. The namespace has been
 changed from com.pjaol.blah.blah.blah to org.apache.blah blah.
 
 3) Update your solrconfig.xml and schema.xml. I copy it from
 http://www.gissearch.com/localsolr.
 
 4) Restart application server and try a query
 /solr/select?qt=geo&lat=xx.xx&long=yy.yy&q=abc&radius=zz.



Re: exclude some fields from copying dynamic fields | schema.xml

2009-11-15 Thread Vicky_Dev

Thanks for response

Defining the field is not working :(

Is there any way to stop the copy task for a particular set of fields?

Thanks
~Vikrant



Lance Norskog-2 wrote:
 
 There is no direct way.
 
 Let's say you have a nocopy_s and you do not want a copy
 nocopy_str_s. This might work: declare nocopy_str_s as a field and
 make it not indexed and not stored. I don't know if this will work.
 
 It requires two overrides to work: 1) that declaring a field name that
 matches a wildcard will override the default wildcard rule, and 2)
 that stored=false indexed=false works.
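 
 (In schema.xml terms, the untested idea above might look like this sketch:)
 
     <!-- explicitly declare the target so the *_str_s wildcard no longer
          applies, and make the field a no-op; untested workaround -->
     <field name="nocopy_str_s" type="string" indexed="false" stored="false"/>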
 
 On Fri, Nov 13, 2009 at 3:23 AM, Vicky_Dev
 vikrantv_shirbh...@yahoo.co.in wrote:

 Hi,
 we are using the following entry in schema.xml to make a copy of one type
 of
 dynamic field to another :
 copyField source=*_s dest=*_str_s /

 Is it possible to exclude some fields from copying.

 We are using Solr1.3

 ~Vikrant

 --
 View this message in context:
 http://old.nabble.com/exclude-some-fields-from-copying-dynamic-fields-%7C-schema.xml-tp26335109p26335109.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 
 -- 
 Lance Norskog
 goks...@gmail.com
 
 

-- 
View this message in context: 
http://old.nabble.com/exclude-some-fields-from-copying-dynamic-fields-%7C-schema.xml-tp26335109p26367099.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: javabin in .NET?

2009-11-15 Thread Noble Paul നോബിള്‍ नोब्ळ्
For a client the marshal() part is not important. unmarshal() is
probably all you need.

On Sun, Nov 15, 2009 at 12:36 AM, Mauricio Scheffer
mauricioschef...@gmail.com wrote:
 Original code is here: http://bit.ly/hkCbI
 I just started porting it here: http://bit.ly/37hiOs
 It needs: tests/debugging, porting NamedList, SolrDocument, SolrDocumentList
 Thanks for any help!

 Cheers,
 Mauricio

 2009/11/14 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com

 OK. Is there anyone trying it out? where is this code ? I can try to help
 ..

 On Fri, Nov 13, 2009 at 8:10 PM, Mauricio Scheffer
 mauricioschef...@gmail.com wrote:
  I meant the standard IO libraries. They are different enough that the
 code
  has to be manually ported. There were some automated tools back when
  Microsoft introduced .Net, but IIRC they never really worked.
 
  Anyway it's not a big deal, it should be a straightforward job. Testing
 it
  thoroughly cross-platform is another thing though.
 
  2009/11/13 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com
 
   The javabin format does not have many dependencies. It may have 3-4
   classes, and that is it.
 
  On Fri, Nov 13, 2009 at 6:05 PM, Mauricio Scheffer
  mauricioschef...@gmail.com wrote:
   Nope. It has to be manually ported. Not so much because of the
 language
   itself but because of differences in the libraries.
  
  
   2009/11/13 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com
  
    Is there any tool to directly port java to .Net? Then we can extract
    out the client part of the javabin code and convert it.
  
   On Thu, Nov 12, 2009 at 9:56 PM, Erik Hatcher 
 erik.hatc...@gmail.com
   wrote:
Has anyone looked into using the javabin response format from .NET
   (instead
of SolrJ)?
   
It's mainly a curiosity.
   
How much better could performance/bandwidth/throughput be?  How
  difficult
would it be to implement some .NET code (C#, I'd guess being the
 best
choice) to handle this response format?
   
Thanks,
       Erik
   
   
  
  
  
   --
   -
   Noble Paul | Principal Engineer| AOL | http://aol.com
  
  
 
 
 
  --
  -
  Noble Paul | Principal Engineer| AOL | http://aol.com
 
 



 --
 -
 Noble Paul | Principal Engineer| AOL | http://aol.com





-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: javabin in .NET?

2009-11-15 Thread Noble Paul നോബിള്‍ नोब्ळ्
Start with a JavabinDecoder only, so that the class is simple to start with.

2009/11/16 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 For a client the marshal() part is not important.unmarshal() is
 probably all you need

 On Sun, Nov 15, 2009 at 12:36 AM, Mauricio Scheffer
 mauricioschef...@gmail.com wrote:
 Original code is here: http://bit.ly/hkCbI
 I just started porting it here: http://bit.ly/37hiOs
 It needs: tests/debugging, porting NamedList, SolrDocument, SolrDocumentList
 Thanks for any help!

 Cheers,
 Mauricio

 2009/11/14 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com

 OK. Is there anyone trying it out? where is this code ? I can try to help
 ..

 On Fri, Nov 13, 2009 at 8:10 PM, Mauricio Scheffer
 mauricioschef...@gmail.com wrote:
  I meant the standard IO libraries. They are different enough that the
 code
  has to be manually ported. There were some automated tools back when
  Microsoft introduced .Net, but IIRC they never really worked.
 
  Anyway it's not a big deal, it should be a straightforward job. Testing
 it
  thoroughly cross-platform is another thing though.
 
  2009/11/13 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com
 
  The javabin format does not have many dependencies. it may have 3-4
  classes an that is it.
 
  On Fri, Nov 13, 2009 at 6:05 PM, Mauricio Scheffer
  mauricioschef...@gmail.com wrote:
   Nope. It has to be manually ported. Not so much because of the
 language
   itself but because of differences in the libraries.
  
  
   2009/11/13 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com
  
   Is there any tool to directly port java to .Net? then we can etxract
   out the client part of the javabin code and convert it.
  
   On Thu, Nov 12, 2009 at 9:56 PM, Erik Hatcher 
 erik.hatc...@gmail.com
   wrote:
Has anyone looked into using the javabin response format from .NET
   (instead
of SolrJ)?
   
It's mainly a curiosity.
   
How much better could performance/bandwidth/throughput be?  How
  difficult
would it be to implement some .NET code (C#, I'd guess being the
 best
choice) to handle this response format?
   
Thanks,
       Erik
   
   
  
  
  
   --
   -
   Noble Paul | Principal Engineer| AOL | http://aol.com
  
  
 
 
 
  --
  -
  Noble Paul | Principal Engineer| AOL | http://aol.com
 
 



 --
 -
 Noble Paul | Principal Engineer| AOL | http://aol.com





 --
 -
 Noble Paul | Principal Engineer| AOL | http://aol.com




-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Solr date and string search problem

2009-11-15 Thread ashokcz

Hi Lance Norskog,
Thanks for your reply.

Let me first give the config file details.
These are the field types and fields I have defined:

<fieldType class="solr.TextField" name="alphaOnlySort" omitNorms="true"
    sortMissingLast="true">
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
        class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
        generateWordParts="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType class="solr.TextField" name="alphaOnlySortFacet"
    omitNorms="true" sortMissingLast="true">
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType class="solr.TextField" name="specialFacet" omitNorms="true"
    sortMissingLast="true">
  <analyzer type="query">
    <tokenizer class="solr.CommaTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
</types>


<field indexed="true" multiValued="true" name="text" stored="false" type="text"/>
<field indexed="true" name="id" required="true" stored="true" type="string"/>
<field indexed="true" name="status" stored="true" type="alphaOnlySortFacet"/>
<field indexed="false" name="noofViews" stored="true" type="integer"/>
<field indexed="true" name="uploadedBy" stored="true" type="text"/>
<field indexed="true" name="uploadedOn" stored="true" type="date"/>
<field indexed="true" name="popularity" stored="true" type="float"/>
<field indexed="true" name="Plant" stored="true" type="specialFacet"/>
<field indexed="true" name="PlantSearch" stored="true" type="alphaOnlySort"/>
<field indexed="true" name="Geography" stored="true" type="alphaOnlySortFacet"/>
<field indexed="true" name="GeographySearch" stored="true" type="alphaOnlySort"/>
<field indexed="true" name="Region" stored="true" type="alphaOnlySortFacet"/>
<field indexed="true" name="RegionSearch" stored="true" type="alphaOnlySort"/>
<field indexed="true" name="Country" stored="true" type="alphaOnlySortFacet"/>
<field indexed="true" name="CountrySearch" stored="true" type="alphaOnlySort"/>
<field indexed="true" name="BusUnit" stored="true" type="specialFacet"/>
<field indexed="true" name="BusUnitSearch" stored="true" type="alphaOnlySort"/>
<field indexed="true" name="BusinessFunction" stored="true" type="alphaOnlySortFacet"/>
<field indexed="true" name="BusinessFunctionSearch" stored="true" type="alphaOnlySort"/>
<field indexed="true" name="Functionality" stored="true" type="alphaOnlySortFacet"/>
<field indexed="true" name="FunctionalitySearch" stored="true" type="alphaOnlySort"/>
<field indexed="true" name="Businessprocesses" stored="true" type="text"/>
<field indexed="true" name="UploadedDate" stored="true" type="date"/>


and this is my requestHandler configuration 


<requestHandler class="solr.DisMaxRequestHandler" name="dismaxRelAndPop">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">PlantSearch^1 GeographySearch^1 RegionSearch^1
        CountrySearch^1 BusUnitSearch^1 BusinessFunctionSearch^1
        Businessprocesses^1 LifecycleStatus^1 ApplicationNature^1 UploadedDate^1
    </str>
    <str name="pf">PlantSearch^1 GeographySearch^1 RegionSearch^1
        CountrySearch^1 BusUnitSearch^1 BusinessFunctionSearch^1
        Businessprocesses^1 LifecycleStatus^1 ApplicationNature^1 UploadedDate^1
    </str>
    <str name="fl">*,score</str>
    <str name="bf">
        ord(popularity)^0.5 recip(rord(popularity),1,1000,1000)^0.3
    </str>
    <str name="q.alt">*:*</str>
    <str name="mm">
        10&lt;50%
    </str>
  </lst>
</requestHandler>


  and this is the query that has been fired:

facet.limit=-1&rows=10&start=0&facet=true&facet.mincount=1&facet.field=Geography&facet.field=Country&facet.field=Functionality&facet.field=BusinessFunction&facet.field=BusUnit&facet.field=Region&facet.field=PGServiceManager&facet.field=AppName&facet.field=Plant&facet.field=status&q=Behavior&facet.sort=true

  I clearly understand where the problem is happening, but I don't know how to
resolve it.

  I have defined UploadedDate as a date field, and I have told my request
handler to search in the UploadedDate field also (UploadedDate^1).

  But what happens is that every query that is fired is converted to a date,
and it throws me an error.
  If I remove UploadedDate from the request handler, it works fine.

  So I don't know how to have some string fields and some date fields coexist
in a request handler??
  And according to the given query, solr should filter on all the
fields and should give me the result back.
  Is there a way to do that?
  Sorry for such a long response :)

  Thanks
  ---
  Ashok
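
  (One common way to reconcile the two field types -- offered here as a hedged
  suggestion, not something from the thread: keep only text fields in qf, and
  express date constraints as a separate filter query, e.g.:

      q=Behavior&fq=UploadedDate:[2009-01-01T00:00:00Z TO 2009-12-31T23:59:59Z]
  )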


  



Lance Norskog-2 wrote:
 
 This line is the key:
 SEVERE: org.apache.solr.core.SolrException: Invalid Date
 String:'Behavior'
at org.apache.solr.schema.DateField.toInternal(DateField.java:108)
at
 
 The string 'Behavior' is being parsed as a date, and fails. Your query
 is attempting to find this as a date. Please post your query. 

Re: Newbie tips: migrating from mysql fulltext search / PHP integration

2009-11-15 Thread Israel Ekpo
On Mon, Nov 16, 2009 at 12:34 AM, Mattmann, Chris A (388J) 
chris.a.mattm...@jpl.nasa.gov wrote:

 WOW, +1!! Great job, PHP!

 Cheers,
 Chris



 On 11/15/09 10:13 PM, Otis Gospodnetic otis_gospodne...@yahoo.com
 wrote:

 Hi,

 I'm not sure if you have a specific question there.
 But regarding PHP integration part, I just learned PHP now has native
 Solr (1.3 and 1.4) support:

  http://twitter.com/otisg/status/5757184282


 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 
  From: mbneto mbn...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Sun, November 15, 2009 4:56:15 PM
  Subject: Newbie tips: migrating from mysql fulltext search / PHP
 integration
 
  Hi,
 
  I am looking for alternatives to MySQL fulltext searches.  The combo
  Lucene/Solr is one of my options and I'd like to gather as much
 information
  I can before choosing and even build a prototype.
 
  My current need does not seem to be different.
 
  - fast response time (currently some searches can take more than 11sec)
  - API to add/update/delete documents to the collection
  - way to add synonyms or similar words for misspelled ones (ex. Sony =>
  Soni)
  - way to define relevance of results (ex. If I search for LCD return
  products that belong to the LCD category, contains LCD in the product
  definition or ara marked as special offer)
 
  I know that I may have to add external code, for example, to take the
  results and apply some business logic to resort the results but I'd like
 to
  know, besides the wiki and the solr 1.4 Enterprise Seacrh Server book
 (which
  I am considering to buy) the tips for solr usage.



 ++
 Chris Mattmann, Ph.D.
 Senior Computer Scientist
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 171-266B, Mailstop: 171-246
 Email: chris.mattm...@jpl.nasa.gov
 WWW:   http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Assistant Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++




Hi,

There is native support for Solr in PHP, but currently you have to build it
as a PECL extension.

It is not bundled with the PHP source yet, but it is downloadable
from the PECL project homepage:

http://pecl.php.net/package/solr

If you currently have pecl support built into your php installation you can
install it by running the following command

pecl install solr-beta

Some usage examples are available here

http://us3.php.net/manual/en/solr.examples.php

More details are available here

http://www.php.net/manual/en/book.solr.php

I use Solr with PHP 5.2

- In PHP, the SolrClient class has methods to add, update, and delete
documents, and to roll back changes to the index made since the last commit.
- There are also built-in tools in Solr that allow you to analyze and modify
the data before indexing it and when searching for it.
- With Solr you can define synonyms (check the wiki for more details).
- Solr also allows you to sort by score (relevance).
- You can specify the fields that you want as either optional, required, or
prohibited.

My last two points could take care of your last requirement.

Solr is awesome, and most of the searches I perform return sub-second
response times.

It's several hundred times easier and more efficient than MySQL fulltext,
believe me.
-- 
Good Enough is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.