Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
Can you try this: Issue a full import command like this: http://localhost:8983/solr/dataimport?command=full-import (There is no core name here - if you're using a core name (db?), then add that in between solr/ and /dataimport) then,
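A minimal sketch of how the full-import URL described above is put together, with the optional core name placed between solr/ and /dataimport. The helper name `dih_url` and its parameters are hypothetical and used only for illustration; the host, port, and path come from the message.

```python
from typing import Optional
from urllib.parse import urlencode


def dih_url(command: str, core: Optional[str] = None,
            base: str = "http://localhost:8983/solr") -> str:
    """Build a DataImportHandler URL (hypothetical helper, not part of Solr)."""
    parts = [base.rstrip("/")]
    if core:
        # The core name, if any, goes between solr/ and /dataimport.
        parts.append(core.strip("/"))
    parts.append("dataimport")
    return "/".join(parts) + "?" + urlencode({"command": command})


if __name__ == "__main__":
    print(dih_url("full-import"))               # no core name
    print(dih_url("full-import", core="mail"))  # core named "mail"
```

The resulting URL can be opened in a browser or requested with any HTTP client.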

Re: Help -DIH (mail)

2011-03-04 Thread Matias Alonso
I'm using the trunk. Thanks, Peter, for your concern! Matias. 2011/3/4 Peter Sturge > Hi Matias, > > What version of Solr are you using? Are you running any patches (maybe > SOLR-2245)? > > Thanks, > Peter > > > > On Fri, Mar 4, 2011 at 8:25 PM, Matias Alonso > wrote: > > > Hi Peter, > >

Re: Problem adding new requesthandler to solr branch_3x

2011-03-04 Thread Koji Sekiguchi
If this amended to read: true the solr-example starts fine. Paul, It should be true. Koji -- http://www.rondhuit.com/en/

Re: Help on Multi-language support

2011-03-04 Thread cyang2010
This is the solr schema: -- View this message in context: http://lucene.472066.n3.nabble.com/Help-on-Multi-language-support-tp2636054p2636065.html Sent from the Solr - User mailing list archive at Nabble.com.

Help on Multi-language support

2011-03-04 Thread cyang2010
Hi, I wonder how Solr can satisfy our multi-language requirement. For example, for movie/TV series titles, we require that, based on the user's preferred language, the user can get back title names (and actors, directors) in the selected language. For example, getTitlesByGenreId. On the other hand,

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
Hi Matias, What version of Solr are you using? Are you running any patches (maybe SOLR-2245)? Thanks, Peter On Fri, Mar 4, 2011 at 8:25 PM, Matias Alonso wrote: > Hi Peter, > > From "DataImportHandler Development Console" I made a full-import, but > didn't work. > > Now, I execute " > http://

RE: Help please - recursively indexing lots and lots of text files

2011-03-04 Thread Steven A Rowe
Hi Colin, Solr's DataImportHandler sounds like what you want: http://wiki.apache.org/solr/DataImportHandler In particular, take a look at FileListEntityProcessor: http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor Steve > -Original Message- > From: c

Help please - recursively indexing lots and lots of text files

2011-03-04 Thread csm
Hi, I'm new to Lucene/Solr and I'm trying to build an index of a large body of plaintext files for some corpus research that I'm doing. There are about 37,000 files of typically 50-100 lines each, and they're scattered throughout a huge nested directory structure. I've worked through the basic S

Re: When Index is Updated Frequently

2011-03-04 Thread Michael McCandless
On Fri, Mar 4, 2011 at 3:22 PM, Dennis Gearon wrote: > Nearly 100ms? If any netizen ever complained about that, I'd 'round-file' the > complaint. Internal to a single process's execution, well, maybe it's an > issue. > Not too hard to handle. Well there are many caveats, but 100 msec is where (

Re: When Index is Updated Frequently

2011-03-04 Thread Michael McCandless
On Fri, Mar 4, 2011 at 3:21 PM, Bing Li wrote: > I have a question. If Lucene is good at updating, it must put more load on the > Solr cluster. So in my system, I will leave the large amount of crawled data > unchanged forever. Meanwhile, I use a traditional database to keep mutable > data. > > For

DIH - Multiple Cores / Consistent Hashing

2011-03-04 Thread Frederik Kraus
Hi Guys, I'm currently working on a project with quite a few shards/cores etc. and ideally want to use the DIH to do the indexing. Is there any consistent hashing method available, other than the modulo way of selecting only specific documents? Thanks, Fred.
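For context on the difference the question refers to, here is a rough client-side sketch of a consistent hash ring. It is not a DIH feature and does not answer whether DIH offers one; the class name `HashRing`, the shard names, and the replica count are all assumptions made for illustration. With modulo selection, changing the number of shards reassigns most documents; with a ring, adding a shard only moves documents onto the new shard.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable integer hash of a string (MD5 chosen only for determinism)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Illustrative consistent hash ring; all names here are hypothetical."""

    def __init__(self, shards, replicas=100):
        # Each shard is placed on the ring at `replicas` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, doc_id: str) -> str:
        # First ring point clockwise from the document's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(doc_id)) % len(self._points)
        return self._ring[idx][1]


if __name__ == "__main__":
    ids = [f"doc-{n}" for n in range(1000)]
    before = HashRing(["shard0", "shard1", "shard2"])
    after = HashRing(["shard0", "shard1", "shard2", "shard3"])
    moved = sum(before.shard_for(i) != after.shard_for(i) for i in ids)
    print(f"{moved} of {len(ids)} documents moved after adding a shard")
```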

RE: When Index is Updated Frequently

2011-03-04 Thread Jonathan Rochkind
If you can make that solution work for you, I think it is a wise one which will serve you well. In some cases that solution won't work, because you _need_ the frequently changing data in Solr to be searched against in Solr. But if you can get away without that, I think you will be well-served b

Drop documents when indexing with DIH

2011-03-04 Thread Rosa (Anuncios)
Hi, Is it possible to skip documents when indexing with DIH, based on a regex that filters certain "badwords", for example? Thanks for your help, rosa

Re: Help -DIH (mail)

2011-03-04 Thread Matias Alonso
Hi Peter, From "DataImportHandler Development Console" I made a full-import, but it didn't work. Now, I execute " http://localhost:8983/solr/mail/dataimport?command=full-import" but nothing happens; no index; no errors. thks... Matias. 2011/3/4 Peter Sturge > Hi Matias, > > > http://local

Re: When Index is Updated Frequently

2011-03-04 Thread Dennis Gearon
Nearly 100ms? If any netizen ever complained about that, I'd 'round-file' the complaint. Internal to a single process's execution, well, maybe it's an issue. Not too hard to handle. Good job to the team that made it! From: Michael McCandless To: solr-user@lucene.apache.org; bing...@asu.edu C

Re: When Index is Updated Frequently

2011-03-04 Thread Bing Li
Dear Michael, Thanks so much for your answer! I have a question. If Lucene is good at updating, it must put more load on the Solr cluster. So in my system, I will leave the large amount of crawled data unchanged forever. Meanwhile, I use a traditional database to keep mutable data. Fortunately, in

Problem adding new requesthandler to solr branch_3x

2011-03-04 Thread Paul Rogers
Dear All, Following on from: http://lucene.472066.n3.nabble.com/Problem-with-Solr-and-Nutch-integration-tp2590334p2601915.html I'm trying to add a new request handler to solr (the branch_3x checked out from svn). The request handler is as follows: dismax explicit 0.0

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
Hi Matias, http://localhost:8983/solr/mail/admin/dataimport.jsp?handler=/dataimport accesses the dataimport handler, but you need to tell it to do something by sending a command: http://localhost:8983/solr/mail/admin/dataimport.jsp?handler=/dataimport ?command=full-import

Re: Help -DIH (mail)

2011-03-04 Thread Matias Alonso
Hi Peter, I tested with deltaFetch="false", but it doesn't work :( I'm using "DataImportHandler Development Console" to index ( http://localhost:8983/solr/mail/admin/dataimport.jsp?handler=/dataimport); I'm working with "example-DIH". thks... 2011/3/4 Peter Sturge > Hi Matias, > > I haven't seen

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
Hi Matias, I haven't seen it in the posts, but I may have missed it -- what is the import command you're sending? Something like: http://localhost:8983/solr/db/dataimport?command=full-import Can you also test it with deltaFetch="false". I seem to remember having some problems with delta in the Ma

Re: When Index is Updated Frequently

2011-03-04 Thread Michael McCandless
On Fri, Mar 4, 2011 at 10:09 AM, Bing Li wrote: > In my experience, when a Lucene index is updated frequently, its > performance becomes low. Is that correct? In fact Lucene can gracefully handle a high rate of updates with low latency turnaround on the readers, using the near-real-t

Re: Index xml files with variable structure stored in tree hierarchy

2011-03-04 Thread Walter Underwood
This is very difficult to do with Solr, since it does not support hierarchical data directly. I would recommend a database that handles XML natively. You can try eXist (open source) or MarkLogic (commercial, I work there). wunder Walter Underwood Lead Engineer, MarkLogic On Mar 4, 2011, at 10:2

Re: Help -DIH (mail)

2011-03-04 Thread Matias Alonso
2011/3/4 Peter Sturge > Hi Matias, > > Can you post your data-config.xml? (with disguised names/credentials) > > Thanks, > Peter > > > On Fri, Mar 4, 2011 at 5:13 PM, Matias Alonso > wrote: > > > Thks Peter, > > > > Yes, gmail gives me imaps (i understood that). So, I tried what you

Index xml files with variable structure stored in tree hierarchy

2011-03-04 Thread Andriy Kurochenko
Hello, I need to index plenty of XML files in a tree directory structure, preserving element names, element attributes and values, so that I can, for example, search for occurrences of an element's value, or for the presence of an element with a given name whose attribute with a given name has a given value. I also

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
Hi Matias, Can you post your data-config.xml? (with disguised names/credentials) Thanks, Peter On Fri, Mar 4, 2011 at 5:13 PM, Matias Alonso wrote: > Thks Peter, > > Yes, gmail gives me imaps (i understood that). So, I tried what you mention > but I still get the original message I posted. > > M

Re: Help -DIH (mail)

2011-03-04 Thread Matias Alonso
Thks Peter, Yes, gmail gives me imaps (I understood that). So, I tried what you mention but I still get the original message I posted. Matias. 2011/3/4 Peter Sturge > Hi, > > You need to put your password in as well. You should use protocol="imap" > unless your gmail is set for imaps (I don't

Re: Help -DIH (mail)

2011-03-04 Thread Matias Alonso
Thks Gora, I forgot "s". Now there is no error, but nothing is indexed. This is the output in my command line (windows :( ). .. .. .. protocol : imaps host : imap.gmail.com folders : Recibidos,recibidos,RECIBIDOS,inbox.InBox,INBOX,Mail,MAIL,mai

Trying to use FieldReaderDataSource in DIH

2011-03-04 Thread Jeff Schmidt
Hello: I'm trying to make use of FieldReaderDataSource so that I can read an (Oracle) database CLOB, and then use XPathEntityProcessor to derive Solr field values via xpath notation. For an extra bit of fun, the CLOB itself is base 64 encoded and gzip'd. I created a transformer of my own to ta

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
Hi, You need to put your password in as well. You should use protocol="imap" unless your gmail is set for imaps (I don't believe the free gmail gives you this). HTH Peter On Fri, Mar 4, 2011 at 4:42 PM, Gora Mohanty wrote: > On Fri, Mar 4, 2011 at 9:20 PM, Matias Alonso > wrote: > > H

Re: Help -DIH (mail)

2011-03-04 Thread Gora Mohanty
On Fri, Mar 4, 2011 at 9:20 PM, Matias Alonso wrote: > Hi everyone! > > > I’m trying to index mails into solr through DIH (based on the > “example-DIH”). For this I'm using my personal email from gmail, but I can't > index. Have not used the MailEntityProcessor with Gmail, but some points below:

RE: Model foreign key type of search?

2011-03-04 Thread Jonathan Rochkind
Yep, it's tricky to do this sort of thing in Solr. One way to do it would be to try and reindex the main item on some regular basis with the keywords/comments actually flattened into the main record. Maybe along with a field for number_of_comments, so you can boost on that or what have you. I

Help -DIH (mail)

2011-03-04 Thread Matias Alonso
Hi everyone! I’m trying to index mails into solr through DIH (based on the “example-DIH”). For this I'm using my personal email from gmail, but I can't index. Configuration in data-config.xml: When I enable debug and verbose I get the following message (only first lines): o

Solr integration with Jackrabbit jcr repository.

2011-03-04 Thread kniels21
Hi all, I would like to leverage the Solr search platform to index and search content from a jcr repository managed by Jackrabbit. I am aware that Jackrabbit uses a built in Lucene index for indexing node data. I would like to make use of Solr's search features (caching,faceting etc) instead of

Re: Solr chained exclusion query

2011-03-04 Thread Peter Sturge
Hi, Oh, how I wish it was as simple as that! :-) The tricky ingredient in the use case is to exclude all documents (from any 'saledate') if there's a "recent" 'product' match (e.g. last month). So, essentially you have to somehow build a query that looks at 2 different criteria for the same field

When Index is Updated Frequently

2011-03-04 Thread Bing Li
Dear all, In my experience, when a Lucene index is updated frequently, its performance becomes low. Is that correct? In my system, most data crawled from the Web is indexed and the corresponding index will NOT be updated any more. However, some indexes should be updated frequently li

dismax, and too much qf?

2011-03-04 Thread Jeff Schmidt
Hello: I'm working on implementing a requirement where when a document is returned, we want to pithily tell the end user why. That is, say, with five documents returned, they may be so for similar or different reasons. These "reasons" are the field(s) in which matches occurred. Some are more i

Re: Max Document Size

2011-03-04 Thread Jayendra Patil
Hi Sean, If you are using remote streaming, there is a limit in the SolrConfig on the max file size that can be uploaded. You might want to check on that. Regards, Jayendra On Thu, Mar 3, 2011 at 5:58 PM, Sean Todd wrote: > Is there a maximum document size that Solr can handle?  I'm tryin

Re: Solr chained exclusion query

2011-03-04 Thread Savvas-Andreas Moysidis
Can you not calculate on the fly the date that is one month before the current date and use that as your upper limit? e.g. taking today as an example your upper limit would be 2011-02-04T00:00:00Z and so your query would be something like: q=products:Dog AND saledate:[* TO 2011-02-04T00:00:0
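The suggestion above can be sketched as follows: compute the date one month before today and use it as the upper bound of the range. The function names are hypothetical, the closing bracket of the range is an assumption (the snippet is cut off before it), and the field names `products` and `saledate` are taken from the message.

```python
import calendar
from datetime import date


def one_month_before(d: date) -> date:
    """Same day in the previous month, clamped to that month's last day."""
    year, month = (d.year, d.month - 1) if d.month > 1 else (d.year - 1, 12)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def older_sales_query(product: str, today: date) -> str:
    """Query string with an upper saledate limit one month before `today`."""
    upper = one_month_before(today).strftime("%Y-%m-%dT00:00:00Z")
    # Assumption: the range is closed with "]" as in standard range syntax.
    return f"products:{product} AND saledate:[* TO {upper}]"


if __name__ == "__main__":
    print(older_sales_query("Dog", date(2011, 3, 4)))
```

Solr's date math (for example `NOW-1MONTH`) may be an alternative to computing the date on the client.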

Full Text Search with multiple index and complex requirements

2011-03-04 Thread Shrinath M
We are building an application which will require us to index data for each of our users so that we can provide full text search on their data. Here are some notable things about the application: A) The data for every user is totally unrelated to every other user. This gives us a few advantages:

Solr chained exclusion query

2011-03-04 Thread Peter Sturge
Hello, I've been wrestling with a query use case, perhaps someone has done this already? Is it possible to write a query that excludes results based on another query? Scenario: I have an index that holds: 'customer' (textgen) 'product' (textgen) 'saledate' (date) I'm looking to ret

Re: perfect match in dismax search

2011-03-04 Thread Gastone Penzo
Good work. It's a great idea 2011/3/3 Jan Høydahl > Hi, > > I'm working on a Filter which enables boundary match using syntax > title:"^hello I love you$" > which will make sure that the match is exact. See SOLR-1980 (no working > patch yet) > > -- > Jan Høydahl, search solution architect > Comi

Re: Out of memory while creating indexes

2011-03-04 Thread Praveen Parameswaran
Hi, post.sh is using curl as far as I can see; will that be helpful? On Fri, Mar 4, 2011 at 1:24 PM, Upayavira wrote: > post.jar is intended for demo purposes, not production use, so it > doesn't surprise me you've managed to break it. > > Have you tried using curl to do the post? > > Upayavira > > On T

Re: adding a document using curl

2011-03-04 Thread Marc SCHNEIDER
Hi, Could you please post exactly what you tried? Regards, On Thu, Mar 3, 2011 at 12:31 PM, Ken Foskey wrote: > > I have read the various pages and used curl a lot but I cannot figure out > the correct command line to add a document to the example Solr instance. > > I have tried a few things h