How to post in-memory[not residing on local disks] Xml files to Solr server for indexing?

2009-04-27 Thread ahmed baseet
Hi All,
I'm trying to post some files to a Solr server. I've done this using the
post.jar tool for posting XML files residing on my local disk [I tried
posting all the XML files from the example directory]. Now I'm trying to
generate XML files on the fly, with the text to be indexed included, and
want to post these files to Solr. As per the examples, we've used
SimplePostTool for posting locally residing files, but can someone give me
direction on indexing in-memory XML files [files generated on the fly]?
Actually I want to automate this process in a loop, so that I'll extract
some information, put it into an XML file, and push it off to Solr for
indexing.
Thanks in anticipation.

--Ahmed.


Re: How to post in-memory[not residing on local disks] Xml files to Solr server for indexing?

2009-04-27 Thread Shalin Shekhar Mangar
On Mon, Apr 27, 2009 at 3:30 PM, ahmed baseet ahmed.bas...@gmail.com wrote:

 Hi All,
 I'm trying to post some files to Solr server. I've done this using the
 post.jar files for posting xml files residing on my local disk[I tried
 posting all those xml files from example directory]. Now I'm trying to
 generate xml files on the fly, with required text to be indexed included
 therein though, and want to post these files to solr. As per the examples
 we've used SimplePostTool for posting locally resinding files but can
 some
 one give me direction on indexing in-memory xml files[files generated on
 the
 fly]. Actually I want to automate this process in a loop, so that I'll
 extract some information and put that to xml file and push it off to Solr
 for indexing.
 Thanks in appreciation.



You can use the Solrj client to avoid building the intermediate XML
yourself. Extract the information, use the Solrj api to add the extracted
text to fields and send them to the solr server.

http://wiki.apache.org/solr/Solrj
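A minimal sketch of what that looks like (the class name, URL, field names, and values here are placeholders; this assumes the Solr 1.3 SolrJ API with the jars from dist/solrj-lib on the classpath):

```java
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class InMemoryIndexer {
    public static void main(String[] args) throws Exception {
        // Point at the running Solr instance (placeholder URL)
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        // In the real loop, these values come from your extraction step
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        doc.addField("text", "extracted plain text goes here");

        server.add(doc);   // no intermediate XML file is ever written
        server.commit();   // make the new document searchable
    }
}
```

The document never exists as an XML file on disk; SolrJ serializes it over HTTP, so the extract-and-index loop can run entirely in memory.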

-- 
Regards,
Shalin Shekhar Mangar.


Re: How to post in-memory[not residing on local disks] Xml files to Solr server for indexing?

2009-04-27 Thread ahmed baseet
Hi,
After going through the Solrj wiki I found that we have to set some
dependencies in pom.xml to use Solrj, which I haven't done yet. I googled
how to do that but found no help. I searched the Solr directory and found a
bunch of *-pom.template files [like solr-core-pom.xml, solr-solrj-pom.xml,
etc.] and I'm not able to figure out which one to use. Any help would be
appreciated.

Thanks,
Ahmed.

On Mon, Apr 27, 2009 at 4:53 PM, ahmed baseet ahmed.bas...@gmail.com wrote:

 Shalin, thanks for your quick response.

 Actually I'm trying to pull plain text from HTML pages and make an XML
 file for each page. I went through the SolrJ webpage and found that we
 have to add all the fields and their contents anyway, right? But yes, it
 makes adding/updating etc. quite a bit easier than using SimplePostTool.
  I tried to use the SolrJ client but it does not seem to be working. I
 added all the jar files mentioned in the SolrJ wiki to the classpath but
 it still gives me an error.

 To be precise it gives me the following error,
  .cannot find symbol:
 symbol : class CommonsHttpSolrServer

 I rechecked to make sure that commons-httpclient-3.1.jar is in the class
 path. Can someone please point me what is the issue?

 I'm working on Windows and my classpath variable is this:

 .;E:\Program Files\Java\jdk1.6.0_05\bin;D:\firefox
 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\commons-httpclient-3.1.jar;D:\firefox
 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\apache-solr-common.jar;D:\firefox
 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\apache-solr-1.3.0.jar;D:\firefox
 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\solr-solrj-1.3.0.jar;D:\firefox
 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\commons-io-1.3.1.jar;D:\firefox
 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\commons-codec-1.3.jar;D:\firefox
 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\commons-logging-1.0.4.jar

 Thank you very much.
 Ahmed.



 On Mon, Apr 27, 2009 at 3:55 PM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:

 You can use the Solrj client to avoid building the intermediate XML
 yourself. Extract the information, use the Solrj api to add the extracted
 text to fields and send them to the solr server.

 http://wiki.apache.org/solr/Solrj

 --
 Regards,
 Shalin Shekhar Mangar.





Re: How to post in-memory[not residing on local disks] Xml files to Solr server for indexing?

2009-04-27 Thread Shalin Shekhar Mangar
On Mon, Apr 27, 2009 at 4:53 PM, ahmed baseet ahmed.bas...@gmail.com wrote:


 To be precise it gives me the following error,
  .cannot find symbol:
 symbol : class CommonsHttpSolrServer

 I rechecked to make sure that commons-httpclient-3.1.jar is in the class
 path. Can someone please point me what is the issue?

 I'm working on Windows and my classpath variable is this:

 .;E:\Program Files\Java\jdk1.6.0_05\bin;D:\firefox

 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\commons-httpclient-3.1.jar;D:\firefox

 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\apache-solr-common.jar;D:\firefox

 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\apache-solr-1.3.0.jar;D:\firefox

 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\solr-solrj-1.3.0.jar;D:\firefox

 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\commons-io-1.3.1.jar;D:\firefox

 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\commons-codec-1.3.jar;D:\firefox

 download\apache-solr-1.3.0\apache-solr-1.3.0\dist\solrj-lib\commons-logging-1.0.4.jar


The jars look right. It is likely a problem with your classpath.
CommonsHttpSolrServer is in the solr-solrj jar.
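One way to rule out environment quirks is to pass the jars directly with -cp instead of relying on the CLASSPATH variable (a sketch; the paths and the MyIndexer class are placeholders for your actual dist directory and source file, and on Windows paths containing spaces must be quoted):

```shell
# Java 6's classpath wildcard picks up every jar in the directory
javac -cp "D:\solr\dist\solrj-lib\*" MyIndexer.java
java  -cp ".;D:\solr\dist\solrj-lib\*" MyIndexer
```

If CommonsHttpSolrServer resolves this way, the problem is in how the CLASSPATH variable is being picked up rather than in the jars themselves.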

If you are using Maven, then you'd need to change your pom.xml

-- 
Regards,
Shalin Shekhar Mangar.


Re: How to post in-memory[not residing on local disks] Xml files to Solr server for indexing?

2009-04-27 Thread ahmed baseet
Can anyone help me select the proper pom.xml file out of the bunch of
*-pom.xml.template files available?
I got the following when I searched for pom.xml files:
solr-common-csv-pom.xml
solr-lucene-analyzers-pom.xml
solr-lucene-contrib-pom.xml
solr-lucene-*-pom.xml [a lot of solr-lucene-... pom files are available,
hence shortened to avoid typing them all]
solr-dataimporthandler-pom.xml
solr-common-pom.xml
solr-core-pom.xml
solr-parent-pom.xml
solr-solr-pom.xml

Thanks,
Ahmed.

On Mon, Apr 27, 2009 at 5:38 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 The jars look right. It is likely a problem with your classpath.
 CommonsHttpSolrServer is in the solr-solrj jar.

 If you are using Maven, then you'd need to change your pom.xml

 --
 Regards,
 Shalin Shekhar Mangar.



Re: How to post in-memory[not residing on local disks] Xml files to Solr server for indexing?

2009-04-27 Thread Shalin Shekhar Mangar
On Mon, Apr 27, 2009 at 6:27 PM, ahmed baseet ahmed.bas...@gmail.com wrote:

 Can anyone help me selecting the proper pom.xml file out of the bunch of
 *-pom.xml.templates available.


Ahmed, are you using Maven? If not, then you do not need these pom files. If
you are using Maven, then you need to add a dependency to solrj.

http://wiki.apache.org/solr/Solrj#head-674dd7743df665fdd56e8eccddce16fc2de20e6e
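For reference, the dependency block itself is small; a sketch for a Solr 1.3 setup (the version is assumed from the jars mentioned earlier in this thread):

```xml
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>1.3.0</version>
</dependency>
```

This pulls in solr-solrj and its transitive dependencies (commons-httpclient, commons-io, etc.), replacing the manual classpath juggling.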

-- 
Regards,
Shalin Shekhar Mangar.


Re: Date faceting - howto improve performance

2009-04-27 Thread Ning Li
You mean doc A and doc B will become one doc after adding index 2 to
index 1? I don't think this is currently supported either at Lucene
level or at Solr level. If index 1 has m docs and index 2 has n docs,
index 1 will have m+n docs after adding index 2 to index 1. Documents
themselves are not modified by index merge.

Cheers,
Ning


On Sat, Apr 25, 2009 at 4:03 PM, Marcus Herou
marcus.he...@tailsweep.com wrote:
 Hmm, looking at the code for the IndexMerger in Solr
 (org.apache.solr.update.DirectUpdateHandler(2)), I see that
 IndexWriter.addIndexesNoOptimize(dirs) is used (a union of indexes).

 And the test class org.apache.solr.client.solrj.MergeIndexesExampleTestBase
 suggests:
 add doc A to index1 with id=AAA,name=core1
 add doc B to index2 with id=BBB,name=core2
 merge the two indexes into one index which then contains both docs.
 The resulting index will have 2 docs.

 Great but in my case I think it should work more like this.

 add doc A to index1 with id=X,title=blog entry title,description=blog entry
 description
 add doc B to index2 with id=X,score=1.2
 somehow add index2 to index1 so id=X has score=1.2 when searching in index1
 The resulting index should have 1 doc.

 So this is not really what I want right ?

 Sorry for being a smart-ass...

 Kindly

 //Marcus





 On Sat, Apr 25, 2009 at 5:10 PM, Marcus Herou 
 marcus.he...@tailsweep.com wrote:

 Guys!

 Thanks for these insights, I think we will head for Lucene level merging
 strategy (two or more indexes).
 When merging, I guess the second index needs to have the same doc ids
 somehow. This is an internal id in Lucene, not that easy to get hold of,
 right?

 So you are saying that the Solr ExternalFileField + FunctionQuery stuff
 would not work very well performance-wise, or what do you mean?

 I sure like bleeding edge :)

 Cheers dudes

 //Marcus





 On Sat, Apr 25, 2009 at 3:46 PM, Otis Gospodnetic 
 otis_gospodne...@yahoo.com wrote:


 I should emphasize that the PR trick I mentioned is something you'd do at
 the Lucene level, outside Solr, and then you'd just slip the modified index
 back into Solr.
 Or, if you like the bleeding edge, perhaps you can make use of Ning Li's
 Solr index merging functionality (patch in JIRA).


 Otis --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



 - Original Message 
  From: Otis Gospodnetic otis_gospodne...@yahoo.com
  To: solr-user@lucene.apache.org
  Sent: Saturday, April 25, 2009 9:41:45 AM
  Subject: Re: Date faceting - howto improve performance
 
 
  Yes, you could simply round the date, no need for a non-date type field.
  Yes, you can add a field after the fact by making use of ParallelReader
  and merging (I don't recall the details; search the ML for ParallelReader
  and Andrzej). I remember he once provided the working recipe.
 
 
  Otis --
  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
  - Original Message 
   From: Marcus Herou
   To: solr-user@lucene.apache.org
   Sent: Saturday, April 25, 2009 6:54:02 AM
   Subject: Date faceting - howto improve performance
  
   Hi.
  
   One of our faceting use-cases:
   We are creating trend graphs of how many blog posts contain a certain
   term, grouped by day/week/year etc. with the nice DateMathParser
   functions.

   The performance degrades really fast and consumes a lot of memory,
   which forces OOM from time to time.
   We think it is due to the fact that the cardinality of the field
   publishedDate in our index is huge, almost equal to the number of
   documents in the index.
  
   We need to address that...
  
   Some questions:
  
   1. Can a date field have other date formats than the default of
   yyyy-MM-dd HH:mm:ssZ ?

   2. We are thinking of adding a field to the index which has the format
   yyyy-MM-dd to reduce the cardinality; if that field can't be a date, it
   could perhaps be a string, but the question then is whether faceting
   can be used?
  
   3. Since we already have such a huge index, is there a way to add a
   field afterwards and apply it to all documents without actually
   reindexing the whole shebang?

   4. If the field cannot be a string, can we just leave out the
   hour/minute/second information to reduce the cardinality and improve
   performance? Example: 2009-01-01 00:00:00Z
  
   5. I am afraid that we need to reindex everything to get this to work
   (negates Q3). We currently have 8 shards; what would the most efficient
   way be to reindex the whole shebang? Dump the entire database to disk
   (sigh), create many XML file splits, and use curl in a
   random/hash(numServers) manner on them?
  
  
   Kindly
  
   //Marcus
  
  
  
  
  
  
  
   --
   Marcus Herou CTO and co-founder Tailsweep AB
   +46702561312
   marcus.he...@tailsweep.com
   http://www.tailsweep.com/
   http://blogg.tailsweep.com/




 --
 Marcus Herou CTO and co-founder Tailsweep AB
 +46702561312
 marcus.he...@tailsweep.com
 http://www.tailsweep.com/
 

RE: How to index the contents from SVN repository

2009-04-27 Thread Steven A Rowe
Hi Ashish,

The excellent SVN/CVS repo browser ViewVC http://www.viewvc.org/ has tools to 
record SVN/CVS commit metadata in a database - seeing how they do it may give 
you some hints.

The INSTALL file gives pointers to the relevant tools (look for the SQL 
CHECKIN DATABASE section):

http://viewvc.tigris.org/svn/viewvc/trunk/INSTALL

ViewVC doesn't have file content search capabilities yet - maybe while you're 
at it, you could contribute your work to that project :).

Good luck,
Steve

On 4/27/2009 at 1:12 AM, Ashish P wrote:
 Right. But is there a way to track file updates and diffs?
 Thanks,
 Ashish
 
 Noble Paul നോബിള്‍  नोब्ळ् wrote:
 
  If you can check it out into a directory using SVN command then you
  may use DIH to index the content.
 
  a combination of FileListEntityProcessor and PlainTextEntityProcessor
  may help
 
  On Sun, Apr 26, 2009 at 1:38 PM, Ashish P ashish.ping...@gmail.com
  wrote:
 
  Is there any way to index contents of SVN rep in Solr ??


Solr 1.4 Release Date

2009-04-27 Thread Gurjot Singh
Hi, I am curious to know when is the scheduled/tentative release date of
Solr 1.4.

Thanks,
Gurjot


Configuration of format and type index with solr

2009-04-27 Thread hpn1975 nasc
Hi,

  I have worked with Lucene for some years, and I use some advanced features
of the library, such as different index formats and types of persistence.
Now I would like to use Solr.

  Is it possible to configure these features using Solr? My doubt is about
the possibility of configuring these four things in Solr:

   1- Guarantee that my searcher (Solr) ALWAYS searches my index in *memory*
(using RAMDirectory), not using a cache.
   2- Guarantee that my searcher (Solr) ALWAYS searches my index on the
*file system* (using FSDirectory).
   3- Persist my index so that it is generated in only one file in the file
system (optimized).
   4- Persist my index (RAMDirectory) in a serializable Java file. I would
need to create a loader that loads my .ser file, deserializes the
RAMDirectory object, and sets it in the searcher class. Is it possible to
add a component that manipulates the index?

  Thanks

  Haroldo


Re: How to index the contents from SVN repository

2009-04-27 Thread Ryan McKinley

I would suggest looking at Apache commons VFS and using the solrj API:

http://commons.apache.org/vfs/

With SVN, you may be able to use the webdav provider.

ryan



On Apr 26, 2009, at 4:08 AM, Ashish P wrote:



Is there any way to index contents of SVN rep in Solr ??
--
View this message in context: 
http://www.nabble.com/How-to-index-the-contents-from-SVN-repository-tp23240110p23240110.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Solr Performance bottleneck

2009-04-27 Thread Jon Bodner

As a follow-up note, we solved our problem by moving the indexes to local
store and upgrading to Solr 1.4.  I did a thread dump against our 1.3 Solr
instance and it was spending lots of time blocking on index section loading. 
The NIO implementation in 1.4 solved that problem and copying to local store
almost certainly reduced file loading time.

Trying to point multiple Solr instances on multiple boxes at a single shared
directory is almost certainly doomed to failure; the read-only Solrs won't
know when the read/write Solr instance has updated the index.

We are going to try to move our indexes back to shared disk, as our backup
solutions are all tied to the shared disk.  Also, if an individual box
fails, we can bring up a new box and point it at the shared disk.  Are there
any known problems with NIO and NFS that will cause this to fail?  Can
anyone suggest a better solution?

Thanks,

Jon

-- 
View this message in context: 
http://www.nabble.com/Solr-Performance-bottleneck-tp23209595p23262198.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr Performance bottleneck

2009-04-27 Thread Walter Underwood
This isn't a new problem, NFS was 100X slower than local disk for me
with Solr 1.1.

Backing up indexes is very tricky. You need to do it while they are
not being updated, or you'll get a corrupt copy. If your indexes
aren't large, you are probably better off backing up the source
documents and building new indexes from scratch.

wunder




adding plug-in after search is done

2009-04-27 Thread siping liu

I'm trying to manipulate the search results (e.g. further filtering out
unwanted documents) and ordering the results differently. Where is the
suitable place to do this? I've been using QueryResponseWriter, but that
doesn't seem to be the right place.

thanks.

_
Rediscover Hotmail®: Get quick friend updates right in your inbox. 
http://windowslive.com/RediscoverHotmail?ocid=TXT_TAGLM_WL_HM_Rediscover_Updates2_042009

fail to create or find snapshoot

2009-04-27 Thread Jian Han Guo
Hi,

According to Solr's wiki page http://wiki.apache.org/solr/SolrReplication,
if I send the following request to the master, a snapshot will be created:

http://master_host:port/solr/replication?command=snapshoot

But after I did it, nothing seemed to happen.

I got this response back:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst>
</response>

and I checked the data directory; no snapshot was created.

I am not sure what to expect after making the request, or where to find the
snapshot files (and what they are).

Thanks,

Jianhan


Re: fail to create or find snapshoot

2009-04-27 Thread Jian Han Guo
Actually, I found the snapshot in the directory where Solr was launched. Is
this done on purpose? Shouldn't it be in the data directory?

Thanks,

Jianhan










Re: Configuration of format and type index with solr

2009-04-27 Thread Shalin Shekhar Mangar
On Mon, Apr 27, 2009 at 10:40 PM, hpn1975 nasc hpn1...@gmail.com wrote:


   1- Guarantee that my searcher (solr) ALWAYS search in my index in *memory
 * (use RAMDirectory). Not to use cache.


It is possible to disable all caches. But it is not possible to use
RAMDirectory right now. This is in progress.

https://issues.apache.org/jira/browse/SOLR-465



   2- Guarantee that my searcher (solr) ALWAYS search in my index in *file
 system* (use FSDirectory).


Yes, that is the default and only way currently.



   3- Persist my index is genereted in only one archive in File System
 (optimized)


The useCompoundFile setting in solrconfig.xml can help here.
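For example, a sketch of the relevant solrconfig.xml fragment (the rest of the <mainIndex> section is elided here):

```xml
<mainIndex>
  <!-- write each segment as a single compound (.cfs) file -->
  <useCompoundFile>true</useCompoundFile>
</mainIndex>
```

Combined with an optimize (which merges everything down to one segment), this gets the index close to the single-archive layout you describe.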



   4- Persist my index (RAMDirectory) in serializable archive java. I need
 create a loader that load my .ser and deseriaze the class RAMDirectory e
 set
 in the searcher class.


I don't think you can use RAMDirectory right now. However if the use-case
behind serializing a ram directory is for replication, then there are
alternate methods available.

http://wiki.apache.org/solr/CollectionDistribution
http://wiki.apache.org/solr/SolrReplication


 Is possible add any component that manipule the index
 ?


Yes. You can write your own request handlers and search components.

-- 
Regards,
Shalin Shekhar Mangar.


Re: adding plug-in after search is done

2009-04-27 Thread Shalin Shekhar Mangar
On Tue, Apr 28, 2009 at 12:04 AM, siping liu siping...@hotmail.com wrote:


 trying to manipulate search result (like further filtering out unwanted),
 and ordering the results differently. Where is the suitable place for doing
 it? I've been using QueryResponseWriter but that doesn't seem to be the
 right place.


You should probably look at writing your own SearchComponent. Also look at
the QueryElevationComponent which can help with fixing the positions of some
documents in the result set.

-- 
Regards,
Shalin Shekhar Mangar.


Re: offline solr indexing

2009-04-27 Thread Shalin Shekhar Mangar
On Tue, Apr 28, 2009 at 12:38 AM, Charles Federspiel 
charles.federsp...@gmail.com wrote:

 Solr Users,
 Our app servers are setup on read-only filesystems.  Is there a way
 to perform indexing from the command line, then copy the index files to the
 app-server and use Solr to perform search from inside the servlet
 container?


If the filesystem is read-only, then how can you index at all?

But what I think you are describing is the regular master-slave setup that
we use. A dedicated master on which writes are performed. Multiple slaves on
which searches are performed. The index is replicated to slaves through
script or the new java based replication.
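For reference, a sketch of the Java-based replication configuration from the SolrReplication wiki page (host name, port, and poll interval are placeholders):

```xml
<!-- solrconfig.xml on the write-only master -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- solrconfig.xml on each read-only slave -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master_host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```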


 If the Solr implementation is bound to http requests, can Solr perform
 searches against an index that I create with Lucene?
 thank you,


It can but it is a little tricky to get the schema and analysis correct
between your Lucene writer and Solr searcher.

-- 
Regards,
Shalin Shekhar Mangar.


Re: DataImportHandler Questions-Load data in parallel and temp tables

2009-04-27 Thread Shalin Shekhar Mangar
On Tue, Apr 28, 2009 at 3:43 AM, Amit Nithian anith...@gmail.com wrote:

 All,
 I have a few questions regarding the data import handler. We have some
 pretty gnarly SQL queries to load our indices and our current loader
 implementation is extremely fragile. I am looking to migrate over to the
 DIH; however, I am looking to use SolrJ + EmbeddedSolr + some custom stuff
 to remotely load the indices so that my index loader and main search engine
 are separated.


Currently, if you want to use DIH, the Solr master doubles as the index
loader as well.



 Currently, unless I am missing something, the data gathering from the
 entity
 and the data processing (i.e. conversion to a Solr Document) is done
 sequentially and I was looking to make this execute in parallel so that I
 can have multiple threads processing different parts of the resultset and
 loading documents into Solr. Secondly, I need to create temporary tables to
 store results of a few queries and use them later for inner joins was
 wondering how to best go about this?

 I am thinking to add support in DIH for the following:
 1) Temporary tables (maybe call it temporary entities)? --Specific only to
 SQL though unless it can be generalized to other sources.


Pretty specific to DBs. However, isn't this something that can be done in
your database with views?



 2) Parallel support


Parallelizing import of root-entities might be the easiest to attempt.
There's also an issue open to write to Solr (tokenization/analysis) in a
separate thread. Look at https://issues.apache.org/jira/browse/SOLR-1089

We actually wrote a multi-threaded DIH during the initial iterations. But we
discarded it because we found that the bottleneck was usually the database
(too many queries) or Lucene indexing itself (analysis, tokenization) etc.
The improvement was ~10% but it made the code substantially more complex.

The only scenario in which it helped a lot was when importing from HTTP or a
remote database (slow networks). But if you think it can help in your
scenario, I'd say go for it.



  - Including some mechanism to get the number of records (whether it be
 count or the MAX(custom_id)-MIN(custom_id))


Not sure what you mean here.



 3) Support in DIH or Solr to post documents to a remote index (i.e. create
 a
 new UpdateHandler instead of DirectUpdateHandler2).


Solrj integration would be helpful to many I think. There's an issue open.
Look at https://issues.apache.org/jira/browse/SOLR-853

-- 
Regards,
Shalin Shekhar Mangar.


Re: Solr test anyone?

2009-04-27 Thread Shalin Shekhar Mangar
Yes, look at AbstractSolrTestCase which is the base class of almost all Solr
tests.

http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/util/AbstractSolrTestCase.java

On Mon, Apr 27, 2009 at 6:38 PM, Eric Pugh
ep...@opensourceconnections.com wrote:

 Look into the test code that Solr uses, there is a lot of good stuff on how
 to do testing.
 http://svn.apache.org/repos/asf/lucene/solr/trunk/src/test/.

 Eric


 On Apr 27, 2009, at 6:25 AM, tarjei wrote:

  Hi, I'm looking for ways to test that my indexing methods work correctly
 with my Solr schema.

 Therefore I'm wondering if someone has created a test setup where they
 start a Solr instance and then add some documents to the instance - as a
 Junit/testng test - preferably with a working Maven dependencies for it as
 well.

 I've tried googling for this as well as setting it up myself, but I have
 never managed to get a test working like I want it to.


 Kind regards,
 Tarjei


 -
 Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 |
 http://www.opensourceconnections.com
 Free/Busy: http://tinyurl.com/eric-cal







-- 
Regards,
Shalin Shekhar Mangar.


Re: facet results in order of rank

2009-04-27 Thread Shalin Shekhar Mangar
On Fri, Apr 24, 2009 at 12:25 PM, ristretto.rb ristretto...@gmail.com wrote:

 Hello,

 Is it possible to order the facet results on some ranking score?
 I've had a look at the facet.sort param,
 (
 http://wiki.apache.org/solr/SimpleFacetParameters#head-569f93fb24ec41b061e37c702203c99d8853d5f1
 )
 but that seems to order the facet either by count or by index value
 (in my case alphabetical.)


Facets are not ranked because there is no criteria for determining relevancy
for them. They are just the count of documents for each term in a given
field computed for the current result set.



 We are facing a big number of facet results for multiple termed
 queries that are OR'ed together.  We want to keep the OR nature of our
 queries,
 but, we want to know which facet values are likely to give you higher
 ranked results.  We could AND together the terms, to get the facet
 list to be
 more manageable, but we would be filtering out too many results.  We
 prefer to OR terms and let the ranking bring the good stuff to the
 top.

 For example, suppose we have a index of all known animals and
 each doc has a field AO for animal-origin.

 Suppose we search for:  wolf grey forest Europe
 And generate facets AO.  We might get the following
 facet results:

 For the AO field, lots of countries of the world probably have grey or
 forest or wolf or Europe in their indexing data, so I'm asserting we'd
 get a big list here.
 But, only some of the countries will have all 4 terms, and those are
 the facets that will be the most interesting to drill down on.  Is
 there
 a way to figure out which facet is the most highly ranked like this?


Suppose 10 documents match the query you described. If you facet on AO, then
it would just go through all the terms in AO and give you the number of
documents which have that term. There's no question of relevance at all
here. The returned documents themselves are of course ranked according to
the relevancy score.

Perhaps I've misunderstood the query?

-- 
Regards,
Shalin Shekhar Mangar.


Re: Get the field value that caused the result

2009-04-27 Thread Shalin Shekhar Mangar
On Sat, Apr 25, 2009 at 8:25 PM, Wouter Samaey wouter.sam...@gmail.com wrote:


 I'm looking for a way to determine the value of the field that caused
 the result to be returned.


Can highlighting help here? It returns the snippet from the document which
matched the query.

http://wiki.apache.org/solr/HighlightingParameters
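A sketch of what that looks like in practice (the field name content, the query term, and the local URL are placeholders):

```shell
curl 'http://localhost:8983/solr/select?q=some+term&hl=true&hl.fl=content&hl.snippets=1'
```

The response then carries a highlighting section mapping each matching document's unique key to its snippets (with the matched terms wrapped in <em> tags by default), which tells you which field value triggered the hit.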

-- 
Regards,
Shalin Shekhar Mangar.


Re: Authenticated Indexing Not working

2009-04-27 Thread Shalin Shekhar Mangar
On Sun, Apr 26, 2009 at 11:04 AM, Allahbaksh Asadullah 
allahbaks...@gmail.com wrote:

 HI Otis,
 I am using HTTPClient for authentication. When I use the server with
 Authentication for searching it works fine. But when I use it for
 indexing it throws error.


What is the error? Is it thrown by Solr or your servlet container?

One difference between a search request and update request with Solrj is
that a search request uses HTTP GET by default but an update request uses an
HTTP POST by default. Perhaps your authentication scheme is not configured
correctly for POST requests?
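If it is HTTP Basic authentication, one thing worth checking is preemptive auth: by default Commons HttpClient waits for a 401 challenge before sending credentials, and some container setups reject the unauthenticated POST outright. A sketch against the HttpClient 3.x API used by Solrj 1.3 (URL and credentials are placeholders):

```java
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.UsernamePasswordCredentials;
import org.apache.commons.httpclient.auth.AuthScope;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class AuthSetup {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");
        HttpClient client = server.getHttpClient();
        // Send credentials with the first request instead of waiting for a 401
        client.getParams().setAuthenticationPreemptive(true);
        client.getState().setCredentials(
            AuthScope.ANY, new UsernamePasswordCredentials("user", "secret"));
    }
}
```

With credentials on AuthScope.ANY plus preemptive mode, both GET searches and POST updates carry the Authorization header from the first request.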

-- 
Regards,
Shalin Shekhar Mangar.


Re: Phonetic analysis with the spell-check component?

2009-04-27 Thread Shalin Shekhar Mangar
On Sun, Apr 26, 2009 at 11:55 PM, David Smiley @MITRE.org dsmi...@mitre.org
 wrote:


 It appears to me that the spell-check component can't build a dictionary
 based on phonetic similarity (i.e. using a Phonetic analysis filter).
  Sure,
 you can go ahead and configure the spell check component to use a field
 type
 that uses a phonetic filter but the suggestions presented to the user are
 based on the indexed values (i.e. phonemes), not the original words.  Thus
 the user will be presented with a suggested phoneme which is a poor user
 experience.  It's not clear how this shortcoming could be rectified because
 for a given phoneme, there are potentially multiple words to choose from
 that could be encoded to a given phoneme.


Hmm. I think the problem here is that the spell checker creates its own index
from the indexed tokens of a Solr field, so it no longer has the original
words. But if we had an option to also store the original words in the
spell-check index, we could return them as suggestions.
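As a rough illustration of what I mean, here is a toy sketch that keys the original words by their phonetic encoding. The encoder below (uppercase, vowels stripped) is only a stand-in for a real phonetic filter such as Double Metaphone, and none of this is the actual spell-checker code:

```java
import java.util.*;

// Toy sketch of a spell-check index that keeps the original words:
// the map goes from phonetic key to the surface forms that produced
// it, so suggestions can be shown as real words, not phonemes.
public class PhoneticSuggester {

    // Stand-in for a real phonetic filter (e.g. Double Metaphone):
    // uppercase the word and strip its vowels.
    public static String encode(String word) {
        return word.toUpperCase(Locale.ROOT).replaceAll("[AEIOU]", "");
    }

    private final Map<String, Set<String>> index = new HashMap<>();

    public void add(String word) {
        index.computeIfAbsent(encode(word), k -> new TreeSet<>()).add(word);
    }

    public Set<String> suggest(String misspelling) {
        return index.getOrDefault(encode(misspelling), Collections.emptySet());
    }

    public static void main(String[] args) {
        PhoneticSuggester s = new PhoneticSuggester();
        s.add("grey");
        s.add("gray");
        System.out.println(s.suggest("greay")); // prints [gray, grey]
    }
}
```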

Do you mind creating a Jira issue so that we don't forget about this?

-- 
Regards,
Shalin Shekhar Mangar.


Re: MacOS Failed to initialize DataSource:db+ DataimportHandler ???

2009-04-27 Thread gateway0

Hi,

sure:

message Severe errors in solr configuration. Check your log files for more
detailed information on what may be wrong. If you want solr to continue
after configuration errors, change:
<abortOnConfigurationError>false</abortOnConfigurationError> in null
-
org.apache.solr.common.SolrException: FATAL: Could not create importer.
DataImporter config invalid at
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:114)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:311)
at org.apache.solr.core.SolrCore.init(SolrCore.java:480) at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
at
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:108)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
at
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:525) at
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:627)
at
org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:553)
at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:488) at
org.apache.catalina.startup.HostConfig.start(HostConfig.java:1149) at
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053) at
org.apache.catalina.core.StandardHost.start(StandardHost.java:719) at
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045) at
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at
org.apache.catalina.core.StandardService.start(StandardService.java:516) at
org.apache.catalina.core.StandardServer.start(StandardServer.java:710) at
org.apache.catalina.startup.Catalina.start(Catalina.java:578) at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585) at
org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288) at
org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:413) Caused by:
org.apache.solr.handler.dataimport.DataImportHandlerException: Failed to
initialize DataSource: mydb Processing Document # at
org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:308)
at
org.apache.solr.handler.dataimport.DataImporter.addDataSource(DataImporter.java:273)
at
org.apache.solr.handler.dataimport.DataImporter.initEntity(DataImporter.java:228)
at
org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:98)
at
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:106)
... 31 more Caused by: org.apache.solr.common.SolrException: Could not load
driver: com.mysql.jdbc.Driver at
org.apache.solr.handler.dataimport.JdbcDataSource.createConnectionFactory(JdbcDataSource.java:112)
at
org.apache.solr.handler.dataimport.JdbcDataSource.init(JdbcDataSource.java:65)
at
org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:306)
... 35 more Caused by: java.lang.ClassNotFoundException: Unable to load
com.mysql.jdbc.Driver or
org.apache.solr.handler.dataimport.com.mysql.jdbc.Driver at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:587)
at
org.apache.solr.handler.dataimport.JdbcDataSource.createConnectionFactory(JdbcDataSource.java:110)
... 37 more Caused by: org.apache.solr.common.SolrException: Error loading
class 'com.mysql.jdbc.Driver' at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:273)
at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:577)
... 38 more Caused by: java.lang.ClassNotFoundException:
com.mysql.jdbc.Driver at
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1387)
at
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1233)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:374) at
java.lang.Class.forName0(Native Method) at
java.lang.Class.forName(Class.java:242) at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:257)
... 39 more 

Re: facet results in order of rank

2009-04-27 Thread Gene Campbell
Thanks for the reply

Your thoughts are what I initially was thinking.  But, given some more
consideration, I imagined a system that would take all the docs that
would be returned for a given facet, and get an average score based on
their scores from the original search that produced the facets.  This
would be the facet value's rank.  So, a higher-ranked facet value would
be more likely to return higher ranked results.

The idea is that you want a broad, loose search over a large dataset,
with the results ordered by rank so you get the most relevant results
at the top, e.g. the first page in a search engine website.  You might
have pages and pages of results, but it's the first few, highly ranked
pages of results that most users generally see.  As the relevance
tapers off, they generally do another search.

However, if you compute facet values on these results, you have no way
of knowing if one facet value for a field is more or less likely to
return higher scored, relevant records for the user.  You end up
getting facet values that match records that are often totally
irrelevant.

We can sort by Index order, or Count of docs returned.  What I would
like is a sort based on Score, such that it would be
sum(scores)/Count.
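Sketched in Java, the ordering I have in mind would be something like this (toy code, assuming each matching document's score and facet value are available client-side; this is not something Solr does today):

```java
import java.util.*;

// Toy sketch of score-ranked facets: rank each facet value by the
// mean score of the matching documents that carry it, i.e.
// sum(scores)/count. Assumes scores and facet values are available.
public class RankedFacets {

    public static List<Map.Entry<String, Double>> rank(Map<String, List<Double>> scoresByFacet) {
        List<Map.Entry<String, Double>> ranked = new ArrayList<>();
        for (Map.Entry<String, List<Double>> e : scoresByFacet.entrySet()) {
            double sum = 0.0;
            for (double s : e.getValue()) sum += s;
            ranked.add(new AbstractMap.SimpleEntry<>(
                    e.getKey(), sum / e.getValue().size()));
        }
        // Highest average score first.
        ranked.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, List<Double>> byFacet = new HashMap<>();
        byFacet.put("Germany", Arrays.asList(0.9, 0.7)); // avg 0.8
        byFacet.put("Poland", Arrays.asList(0.2, 0.4));  // avg 0.3
        System.out.println(rank(byFacet).get(0).getKey()); // prints Germany
    }
}
```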

I would assume that most users would be interested in the higher-ranked
ones more often.  So a more efficient UI could be built to
show just the highly ranked facets by this score, and provide a control
to show all the facets (not just the highly ranked ones).

Does this clear up my post at all?

Perhaps this wouldn't be too hard for me to implement.  I have lots of
Java experience, but no experience with Lucene or Solr code.
thoughts?

thanks
gene




On Tue, Apr 28, 2009 at 10:56 AM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
 On Fri, Apr 24, 2009 at 12:25 PM, ristretto.rb ristretto...@gmail.comwrote:

 Hello,

 Is it possible to order the facet results on some ranking score?
 I've had a look at the facet.sort param,
 (
 http://wiki.apache.org/solr/SimpleFacetParameters#head-569f93fb24ec41b061e37c702203c99d8853d5f1
 )
 but that seems to order the facet either by count or by index value
 (in my case alphabetical.)


 Facets are not ranked because there is no criteria for determining relevancy
 for them. They are just the count of documents for each term in a given
 field computed for the current result set.



 We are facing a big number of facet results for multiple termed
 queries that are OR'ed together.  We want to keep the OR nature of our
 queries,
 but, we want to know which facet values are likely to give you higher
 ranked results.  We could AND together the terms, to get the facet
 list to be
 more manageable, but we would be filtering out too many results.  We
 prefer to OR terms and let the ranking bring the good stuff to the
 top.

 For example, suppose we have a index of all known animals and
 each doc has a field AO for animal-origin.

 Suppose we search for:  wolf grey forest Europe
 And generate facets AO.  We might get the following
 facet results:

 For the AO field, lots of countries of the world probably have grey or
 forest or wolf or Europe in their indexing data, so I'm asserting we'd
 get a big list here.
 But, only some of the countries will have all 4 terms, and those are
 the facets that will be the most interesting to drill down on.  Is
 there
 a way to figure out which facet is the most highly ranked like this?


 Suppose 10 documents match the query you described. If you facet on AO, then
 it would just go through all the terms in AO and give you the number of
 documents which have that term. There's no question of relevance at all
 here. The returned documents themselves are of course ranked according to
 the relevancy score.

 Perhaps I've misunderstood the query?

 --
 Regards,
 Shalin Shekhar Mangar.



highlighting html content

2009-04-27 Thread Matt Mitchell
Hi,

I've been looking around but can't seem to find any clear instruction on how
to do this... I'm storing html content and would like to enable highlighting
on the html content. The problem is that the search can sometimes match html
element names or attributes, and when the highlighter adds the highlight
tags, the html is bad.

I've been toying with setting custom pre/post delimiters and then removing
them in the client, but I thought I'd ask the list before I go too far with
that idea :)
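Concretely, the sketch I have in mind is something like this (the @@HL@@ markers are made up; in Solr they would be set via the real hl.simple.pre and hl.simple.post parameters):

```java
// Toy sketch of the custom-delimiter idea: set hl.simple.pre and
// hl.simple.post to markers that cannot occur in the content, then in
// the client HTML-escape the whole snippet and swap the markers for
// real tags, so only the highlight markup survives as live HTML.
public class HighlightCleaner {

    // Made-up marker strings; anything absent from the content works.
    static final String PRE = "@@HL@@";
    static final String POST = "@@/HL@@";

    public static String clean(String snippet) {
        String escaped = snippet
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
        return escaped.replace(PRE, "<em>").replace(POST, "</em>");
    }

    public static void main(String[] args) {
        // Even a match inside a tag attribute can no longer break the markup.
        String fromSolr = "<a href=\"@@HL@@wolf@@/HL@@.html\">grey @@HL@@wolf@@/HL@@</a>";
        System.out.println(clean(fromSolr));
    }
}
```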

Thanks,
Matt


Re: Solr 1.4 Release Date

2009-04-27 Thread Otis Gospodnetic

Gurjot, please see http://wiki.apache.org/solr/Solr1.4 - we are currently 33 
JIRA issues away.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Gurjot Singh gurjotas...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Monday, April 27, 2009 12:45:32 PM
 Subject: Solr 1.4 Release Date
 
 Hi, I am curious to know when is the scheduled/tentative release date of
 Solr 1.4.
 
 Thanks,
 Gurjot



Re: Term highlighting with MoreLikeThisHandler?

2009-04-27 Thread Otis Gospodnetic

Eric,

Have you tried using MLT with parameters described on 
http://wiki.apache.org/solr/HighlightingParameters ?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Eric Sabourin eric.sabourin2...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Monday, April 27, 2009 10:31:38 AM
 Subject: Term highlighting with MoreLikeThisHandler?
 
 I submit a query to the MoreLikeThisHandler to find documents similar to a
 specified document.  This works and I've configured my request handler to
 also return the interesting terms.
 
 Is it possible to have MLT return to me highlight snippets in the similar
 documents it returns? I mean generate hl snippets of the interesting terms?
 If so how?
 
 Thanks... Eric



Re: SOLRizing advice?

2009-04-27 Thread Otis Gospodnetic

My turn to help, Paul.

There is no such page on the Solr Wiki, but I agree with Paul, this can really 
be a quick and painless migration for typical Lucene/Solr setups.  This is 
roughly how I'd do things:

- I'd set up Solr
- I'd create the schema.xml mimicking the fields in the existing Lucene index
- I'd copy over the Lucene index, keeping in mind Lucene jar versions, 
Solr/Lucene jar versions, and index compatibility
- Start Solr
- Go to Admin page and run test queries
- Go to schema/solrconfig.xml and add various other things - proper cache 
sizes, index replication, dismax, spellchecker, etc.

- Go to Lucene-based indexer classes and change them to use Solrj
- Go to Lucene-based searcher classes and change them to use Solrj

I'd leave embedded Solr and dynamic fields for phase 2 of the migration, unless 
those things really are necessary.
I don't think you'd need to do anything with web.xml - Solr comes as a webapp, 
packaged in a war, which contains its own web.xml

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Paul Libbrecht p...@activemath.org
 To: solr-user@lucene.apache.org
 Sent: Monday, April 27, 2009 12:35:59 AM
 Subject: SOLRizing advice?
 
 
 Hello list,
 
 I am surely not the only one who wishes to migrate from bare lucene to solr.
 Many different reasons can be there, e.g. facetting, web-externalization, 
 ease 
 of update... what interests me here are the steps needed in the form of 
 advice 
 as to what to use.
 
 Here's a few hints. I would love a web-page grouping all these:
 
 - first change references to indexwriter/indexreader/indexsearch to be those 
 of 
 SOLR using embedded-solr-server
 
 - make a first solr schema with appropriate analyzers by defining particular 
 dynamic fields
 
 - slowly replace the queries methods with solr queries, slowly taking 
 advantage 
 of solr features
 
 - web-expose the solr core for at least admin by merging the web.xml
 
 Does such a web-page already exist?
 
 thanks in advance
 
 paul



Re: Fwd: Question about MoreLikeThis

2009-04-27 Thread Otis Gospodnetic

Hello,

Well, if you want documents similar to a specific document, then just make sure 
the query (q) matches that one document.  You can do that by using the 
uniqueKey field in the query, e.g. q=id:123 .  Then you will get documents 
similar to that one document that matched your id:123 query.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: jli...@gmail.com jli...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Sunday, April 26, 2009 5:49:26 AM
 Subject: Fwd: Question about MoreLikeThis
 
 I think I understand it now. It means to return MoreLikeThis
 docs for every doc in the result.
 
 ===8==Original message text===
 Hi, I have a question about what MoreLikeThis means - I suppose
 it means get more documents that are similar to _this_ document.
 So I expect the query always takes a known document as an argument.
 I wonder how I should interpret this query:
 
 http://localhost:8983/solr/select?q=apachemlt=truemlt.fl=manu,catmlt.mindf=1mlt.mintf=1fl=id,score
 
 It doesn't seem to specify a document. So what's the This in
 MoreLikeThis in this case? Or, this means something else, and
 not a document?



half width katakana

2009-04-27 Thread Ashish P

I want to convert half-width katakana to full-width katakana. I tried using
the CJK analyzer but it is not working.
Does CJKAnalyzer do it, or is there any other way?
-- 
View this message in context: 
http://www.nabble.com/half-width-katakana-tp23270186p23270186.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: DataImportHandler Questions-Load data in parallel and temp tables

2009-04-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
there is an issue already to write to the index in a separate thread.

https://issues.apache.org/jira/browse/SOLR-1089

On Tue, Apr 28, 2009 at 4:15 AM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
 On Tue, Apr 28, 2009 at 3:43 AM, Amit Nithian anith...@gmail.com wrote:

 All,
 I have a few questions regarding the data import handler. We have some
 pretty gnarly SQL queries to load our indices and our current loader
 implementation is extremely fragile. I am looking to migrate over to the
 DIH; however, I am looking to use SolrJ + EmbeddedSolr + some custom stuff
 to remotely load the indices so that my index loader and main search engine
 are separated.


 Currently, if you want to use DIH, the Solr master doubles as the index
 loader.



 Currently, unless I am missing something, the data gathering from the
 entity
 and the data processing (i.e. conversion to a Solr Document) is done
 sequentially and I was looking to make this execute in parallel so that I
 can have multiple threads processing different parts of the resultset and
 loading documents into Solr. Secondly, I need to create temporary tables to
 store results of a few queries and use them later for inner joins was
 wondering how to best go about this?

 I am thinking to add support in DIH for the following:
 1) Temporary tables (maybe call it temporary entities)? --Specific only to
 SQL though unless it can be generalized to other sources.


 Pretty specific to DBs. However, isn't this something that can be done in
 your database with views?



 2) Parallel support


 Parallelizing import of root-entities might be the easiest to attempt.
 There's also an issue open to write to Solr (tokenization/analysis) in a
 separate thread. Look at https://issues.apache.org/jira/browse/SOLR-1089

 We actually wrote a multi-threaded DIH during the initial iterations. But we
 discarded it because we found that the bottleneck was usually the database
 (too many queries) or Lucene indexing itself (analysis, tokenization) etc.
 The improvement was ~10% but it made the code substantially more complex.

 The only scenario in which it helped a lot was when importing from HTTP or a
 remote database (slow networks). But if you think it can help in your
 scenario, I'd say go for it.



  - Including some mechanism to get the number of records (whether it be
 count or the MAX(custom_id)-MIN(custom_id))


 Not sure what you mean here.



 3) Support in DIH or Solr to post documents to a remote index (i.e. create
 a
 new UpdateHandler instead of DirectUpdateHandler2).


 Solrj integration would be helpful to many I think. There's an issue open.
 Look at https://issues.apache.org/jira/browse/SOLR-853

 --
 Regards,
 Shalin Shekhar Mangar.




-- 
--Noble Paul


Re: MacOS Failed to initialize DataSource:db+ DataimportHandler ???

2009-04-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
Apparently you do not have the driver on the classpath. Drop your driver
jar into ${solr.home}/lib

On Tue, Apr 28, 2009 at 4:42 AM, gateway0 reiterwo...@yahoo.de wrote:

 Hi,

 sure:
 
 message Severe errors in solr configuration. Check your log files for more
 detailed information on what may be wrong. If you want solr to continue
 after configuration errors, change:
 <abortOnConfigurationError>false</abortOnConfigurationError> in null
 -
 [stack trace snipped -- identical to the one in the original message above]

Re: half width katakana

2009-04-27 Thread Koji Sekiguchi

Ashish P wrote:

I want to convert half width katakana to full width katakana. I tried using
cjk analyzer but not working.
Does cjkAnalyzer do it or is there any other way??
  


The CharFilter that comes with trunk/Solr 1.4 covers exactly this type of problem.
If you are using Solr 1.3, try the patch attached to the issue below:

https://issues.apache.org/jira/browse/SOLR-822

Koji




Re: half width katakana

2009-04-27 Thread Ashish P

After this, should I use the same CJKAnalyzer or the CharFilter?
Thanks,
Ashish


Koji Sekiguchi-2 wrote:
 
 Ashish P wrote:
 I want to convert half width katakana to full width katakana. I tried
 using
 cjk analyzer but not working.
 Does cjkAnalyzer do it or is there any other way??
   
 
 CharFilter which comes with trunk/Solr 1.4 just covers this type of
 problem.
 If you are using Solr 1.3, try the patch attached below:
 
 https://issues.apache.org/jira/browse/SOLR-822
 
 Koji
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/half-width-katakana-tp23270186p23270453.html
Sent from the Solr - User mailing list archive at Nabble.com.