DIH, multiple sources, cores and search: single core with multiple entities or single core per source with search across multiple cores?

2017-07-24 Thread Giovanni De Stefano
Hello guys, I need to index content coming from different sources (db, filesystems, …). Those sources share most fields, only a few are specific to the source. Content coming from different sources changes at different rates. Some sources will generate hundreds of thousands of documents, some othe

Re: Antw: Re: How to Debug Solr With Eclipse

2017-07-14 Thread Giovanni De Stefano
-6.0.0` folder (leave all options the way they are) 4) wait a few minutes…it takes a while to build the whole thing, in the meantime it’s normal to see “errors” or “warning”… I hope it helps, Giovanni > On 14 Jul 2017, at 16:01, Rainer Gnan wrote: > > Hi Giovanni, > > thank you fo

Re: How to Debug Solr With Eclipse

2017-07-13 Thread Giovanni De Stefano
ant eclipse and take it from there: the script will tell you what to do next). I hope it helps! Cheers, Giovanni > On 13 Jul 2017, at 19:54, govind nitk wrote: > > Hi, > > Solr has releases, kindly checkout to the needed one. > > > cheers > > On Thu, Jul 13,

Re: How to "chain" import handlers: import from DB and from file system

2017-07-10 Thread Giovanni De Stefano
Thank you guys for your advice! I would rather take advantage as much as possible of the existing handlers/processors. I just realised that nested entities in DIH are extremely slow: I fixed that with a view on the DB (that does a join between 2 tables). The other thing I have to do is chain th

How to "chain" import handlers: import from DB and from file system

2017-07-09 Thread Giovanni De Stefano
Hello all, I have to index (and search) data organised as follows: many files on the filesystem and each file has extra metadata stored on a DB (the DB table has a reference to the file path). I think I should have 1 Solr document per file with fields coming from both the DB (through DIH) and

When does Solr plan to update its embedded Apache Tika version?

2016-02-02 Thread Giovanni Usai
more recent one? Just for your information, we are embedding Solr in our open source product "Datafari" and we are defining a new Parser that needs a newer version of Tika. Thanks and Best regards, *Giovanni Usai * giovanni.u...@francelabs.com www.francelabs.com <http://www.f

is group.query supported in solrcloud (4.8) ?

2014-11-10 Thread Giovanni Bricconi
ping ] the group.query is supported. Am I missing some key parameter? Should the shards parameter be really mandatory? It seems that with group.field it is not required. Thanks Giovanni

grouping finds

2014-11-06 Thread Giovanni Bricconi
Sorry for the basic question q=*:*&fq=-sku:2471834&fq=FiltroDispo:1&fq=has_image:1&rows=100&fl=descCat3,IDCat3,ranking2&group=true&group.field=IDCat3&group.sort=ranking2+desc&group.ngroups=true returns some groups with no results. I'm using solr 4.8.0, the collection has 3 shards Am I missing so

Re: unstable results on refresh

2014-10-23 Thread Giovanni Bricconi
st setup, is it worth the effort? > > Best, > Erick > > On Wed, Oct 22, 2014 at 3:54 AM, Giovanni Bricconi > wrote: > > I have made some small patch to the application to make this problem less > > visible, and I'm trying to perform the optimize once per hour,

Re: unstable results on refresh

2014-10-22 Thread Giovanni Bricconi
ok to have the indexing process a little bit slower. 2014-10-21 18:44 GMT+02:00 Erick Erickson : > Giovanni: > > To see how this happens, consider a shard with a leader and two > followers. Assume your autocommit interval is 60 seconds on each. > > This interval can expire at sl

Re: unstable results on refresh

2014-10-21 Thread Giovanni Bricconi
oblem related to indexing, only some malformed query. After doing an optimize the problem disappeared. So, is the problem related to documents that were deleted from the index? The optimization took 5 minutes to complete 2014-10-21 11:41 GMT+02:00 Giovanni Bricconi : > Nice! > I will monito

Re: unstable results on refresh

2014-10-21 Thread Giovanni Bricconi
Nice! I will monitor the index and try this if the problem comes back. Actually the problem was due to small differences in score, so I think the problem has the same origin 2014-10-21 8:10 GMT+02:00 lboutros : > Hi Giovanni, > > we had this problem as well. > The cause was that t

Re: unstable results on refresh

2014-10-21 Thread Giovanni Bricconi
Alex. > Personal: http://www.outerthoughts.com/ and @arafalov > Solr resources and newsletter: http://www.solr-start.com/ and @solrstart > Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 > > > On 20 October 2014 04:49, Giovanni Bricconi > wrote: > > Hel

unstable results on refresh

2014-10-20 Thread Giovanni Bricconi
core returns different results, is there a command to force that core to refetch the whole index from its master? Thanks Giovanni

Re: solrcloud "indexing completed" event

2014-07-01 Thread Giovanni Bricconi
ers, whatever. They'll return the same _documents_, > but.... > > FWIW, > Erick > > On Mon, Jun 30, 2014 at 7:55 AM, Giovanni Bricconi > wrote: > > Hello > > > > I have one application that queries solr; when the index version changes > > this appli

solrcloud "indexing completed" event

2014-06-30 Thread Giovanni Bricconi
o capture the "commit done on every core of the collection" event? Thank you Giovanni

Re: solr cloud 4.8, synonymfilterfactory and big dictionaries

2014-05-14 Thread Giovanni Bricconi
Thank you Elaine, the split files worked for me too. 2014-05-06 19:15 GMT+02:00 Cario, Elaine : > Hi Giovanni, > > I had the same issue just last week! I worked around it temporarily by > segmenting the file into < 1 MB files, and then using a comma-delimited > list of f
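The workaround quoted above (segmenting the synonyms file into pieces under 1 MB and listing them comma-delimited) could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names `split_lines` and `split_synonyms`, the `_1`, `_2` output naming, and the 1,000,000-byte default are all assumptions.

```python
from pathlib import Path


def split_lines(lines, max_bytes):
    """Group lines into chunks whose UTF-8 size stays at or below max_bytes.

    A single line longer than max_bytes still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for line in lines:
        n = len(line.encode("utf-8"))
        if current and size + n > max_bytes:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += n
    if current:
        chunks.append(current)
    return chunks


def split_synonyms(src, max_bytes=1_000_000):
    """Write src as numbered pieces next to it; return a comma-delimited name list."""
    src = Path(src)
    # newline="" keeps line endings untouched so byte counts match the output.
    with src.open(encoding="utf-8", newline="") as f:
        chunks = split_lines(f, max_bytes)
    names = []
    for i, chunk in enumerate(chunks, 1):
        # Hypothetical naming scheme: synonyms.txt -> synonyms_1.txt, synonyms_2.txt, ...
        out = src.with_name(f"{src.stem}_{i}{src.suffix}")
        with out.open("w", encoding="utf-8", newline="") as g:
            g.write("".join(chunk))
        names.append(out.name)
    return ",".join(names)
```

The returned string is meant to be pasted wherever the single file name was referenced before; splitting on line boundaries keeps each synonym rule intact.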

solr cloud 4.8, synonymfilterfactory and big dictionaries

2014-05-06 Thread Giovanni Bricconi
Hello, I am migrating an application to SolrCloud and I have to deal with a big dictionary, about 10 MB. It seems that I can't upload it to ZooKeeper; is there a way of specifying an external file for the synonyms parameter? Can I compress the file or split it into many small files? I have the same p

Re: Solr relevancy tuning

2014-04-11 Thread Giovanni Bricconi
engine improvement. Thank you Doug! 2014-04-09 17:48 GMT+02:00 Doug Turnbull < dturnb...@opensourceconnections.com>: > Hey Giovanni, nice to meet you. > > I'm the person that did the Test Driven Relevancy talk. We've got a product > Quepid (http://quepid.com) that le

Re: Solr relevancy tuning

2014-04-09 Thread Giovanni Bricconi
nning of the book. Giovanni 2014-04-09 12:59 GMT+02:00 Ahmet Arslan : > Hi Giovanni, > > Here are some relevant pointers : > > > http://www.lucenerevolution.org/2013/Test-Driven-Relevancy-How-to-Work-with-Content-Experts-to-Optimize-and-Maintain-Search-Relevancy > > >

Solr relevancy tuning

2014-04-09 Thread Giovanni Bricconi
suggest me any other practices you are using on your projects? Thank you very much in advance Giovanni

question about synonymfilter

2013-12-23 Thread Giovanni Bricconi
Hello, suppose I have this synonym: abxpower => abx power, and suppose you are indexing "abxpower pipp". From the analyzer I see that abxpower is split into two words, but the second word "power" overlaps the next one text raw_bytes keyword position start end type positionLength abxpower [61 62 78

Re: Data Import Handler

2013-11-06 Thread Giovanni
I configured a data source in Tomcat and referenced it by its JDBC name. So the dev and production sites share the same config file but use different DBs. I hope this helps > On 06 Nov 2013, at 13:25, "Ramesh" > wrote: > > Hi Folks, > > > > Can anyone suggest me how can cu

Re: Data import handler with multi tables

2013-10-29 Thread Giovanni Bricconi
maybe So you can keep the original id, maybe add also an originalTable field if you don't like parsing the id column to discover the table from which the data was read. 2013/10/29 Stefan Matheis > I'v

howto increase indexing speed?

2013-10-16 Thread Giovanni Bricconi
ou for any hints Giovanni

Re: Using a dictionary to boost queries

2013-07-30 Thread Giovanni Bricconi
Maybe you can try with synonyms: add a to the field type you are using for text and then place habeas corpus => habeascorpusxx in the special_words.txt file, then reindex some documents and try some queries with debugQuery=true. Remember to reload the core when changing configuration. 2013

Re: ClassNotFoundException regarding SolrInfoMBean under Tomcat 7

2013-07-05 Thread Giovanni Bricconi
I saw something similar when I placed some jar in tomcat/lib (data import handler), the right place was instead WEB-INF/lib. I would try placing all the needed jars there. 2013/7/5 Michael Bakonyi > Hm, can't anybody help me out? I still can't get my installation run > correctly ... > > What I've fo

Re: Is it possible to find a leader from a list of cores in solr via java code

2013-07-03 Thread Giovanni Bricconi
what I saw with solr 4.2.1 Giovanni 2013/7/3 Erick Erickson > You can always query Zookeeper and find that information out. > Take a look at CloudSolrServer, maybe ZkCoreNodeProps etc. > for examples since CloudSolrServer is "leader aware", it > should have some clues... >

Replicas and soft commit

2013-06-14 Thread Giovanni Bricconi
am I misunderstanding how to use this feature? I don't see soft commit propagation to replicas when sending update to the indexer only: is this true or maybe I haven't changed some configuration files when porting the application to solr4? Giovanni

custom facet.sort

2013-05-07 Thread Giovanni Bricconi
I also have 3/4 other custom sorting to implement Is it possible to plug in a custom java class to provide custom facet.sort modes? Thank you Giovanni

Re: Solr and OpenPipe

2013-03-28 Thread Giovanni Bricconi
Nice one! I see we're having fun. On 28 Mar 2013 17:11, "Fabio Curti" wrote: > git clone https://github.com/kolstae/openpipe > cd openpipe > mvn install > > regards > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Solr-and-OpenPipe-tp484777p4052079.html

Re: solr 4 plugins

2012-12-23 Thread Giovanni Bricconi
This is really interesting! Do you know if these added fields can be used in sorting or faceting? Thanks. On 23 Dec 2012 14:08, "Otis Gospodnetic" wrote: > Hi, > > Look into writing a custom SearchComponent. > > Otis > Solr & ElasticSearch Support > http://sematext.com/ > On Dec 23, 201

Re: are stopwords indexed?

2012-07-16 Thread Giovanni Gherdovich
-terms, but... they were all there. Michael: > Hi Giovanni, > > you have entered the stopwords into stopword.txt file, right? But in the > definition of the field type you are referencing stopwords_FR.txt.. good catch Michael, but that's not the problem. In my message I referred to "

are stopwords indexed?

2012-07-15 Thread Giovanni Gherdovich
ttp://manpages.ubuntu.com/manpages/natty/man1/lucli.1.html show that all stopwords are in my index. Note that I query LuCLI specifying the field, i.e. with "myFieldName:and" and not just with the stopword "and". Is this normal? Are stopwords indexed? Cheers, Giovanni

Re: documentation on the pragmatics behind the example schema.xml

2012-06-30 Thread Giovanni Gherdovich
Hello Eric, 2012/7/1 Erick Erickson : > Your very best way of figuring this out is to use the admin/analysis > page. [...] thank you for this advice. I'll make myself comfortable with the admin/analysis page. cheers, GGhh

Re: more than one text corpus with solr?

2012-06-30 Thread Giovanni Gherdovich
Hi Gora, yes I was actually looking for a multi-core setup. thanks! GGhh 2012/6/30 Gora Mohanty > > Not quite sure what you mean by "more than one > corpus", and by "several independent indices" in > this context, but maybe multi-core Solr will meet > your needs: http://wiki.apache.org/solr/Cor

Re: difference between stored="false" and stored="true" ?

2012-06-30 Thread Giovanni Gherdovich
Thank you François and Jack for those explanations. Cheers, GGhh 2012/6/30 François Schiettecatte: > Giovanni > > means the data is stored in the index and [...] 2012/6/30 Jack Krupansky: > "indexed" and "stored" are independent [...]

Re: how do I trash a whole index and start over?

2012-06-30 Thread Giovanni Gherdovich
2012/6/30 Dmitry Kan: > Hello, > > The easiest way is to remove what's inside data/index directory; in case > you have a spell-checker index, remove it as well. This requires solr > instance restart. thanks dmitry, I'll go for this solution. cheers, GGhh
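Besides removing the files under data/index as suggested above, a commonly used alternative is to send Solr a delete-by-query for `*:*` followed by a commit. The sketch below is not from the thread: it only builds the two HTTP requests without sending them, the helper name `delete_all_requests` is hypothetical, and the update URL is an assumption that must be adapted to the actual installation.

```python
from urllib.request import Request


def delete_all_requests(update_url):
    """Build (but do not send) the two POST requests that empty an index."""
    headers = {"Content-Type": "text/xml; charset=utf-8"}
    bodies = [
        "<delete><query>*:*</query></delete>",  # match every document
        "<commit/>",                            # make the deletion visible
    ]
    return [
        Request(update_url, data=body.encode("utf-8"), headers=headers, method="POST")
        for body in bodies
    ]


# Assumed URL; sending would be done with urllib.request.urlopen(req) for each request.
requests_to_send = delete_all_requests("http://localhost:8983/solr/update")
```

After a schema change the documents still have to be reindexed; this only clears the old content.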

difference between stored="false" and stored="true" ?

2012-06-30 Thread Giovanni Gherdovich
Hi all, when declaring a field in the schema.xml file you can set the attributes 'indexed' and 'stored' to "true" or "false". What is the difference between a and a ? I guess understanding this would require me to have a closer look to lucene's index data structures; what's the pointer to some

documentation on the pragmatics behind the example schema.xml

2012-06-30 Thread Giovanni Gherdovich
Hi all, in the example schema.xml I can find a wide variety of fieldType and field, already there to be used. I believe each of them has been designed for a specific usage case, with some pragmatics in mind. Where can I find documentation on what those field / fieldTypes were designed for? Is th

how do I trash a whole index and start over?

2012-06-30 Thread Giovanni Gherdovich
Hi all, how do I trash a whole index and start over with a new fresh index of my corpus? I need that since I modified my schema.xml since my last indexing, and I'd like the changes to be taken into account. Cheers, Giovanni

Re: how to retrieve a doc from its docID ?

2012-06-30 Thread Giovanni Gherdovich
Sascha: > You should also make sure that the field definition (in schema.xml) for 'text' > says stored="true", otherwise the field will not be returned. I guess you're hitting my problem. The field I want to search on is declared with store=false in the schema.xml: -- -- >8 -- -- >8 -- -- >8 -

Re: querying thru solritas gives me zero results

2012-06-30 Thread Giovanni Gherdovich
2012/6/30 Erik Hatcher: > Debugging this you can add &debugQuery=true&wt=xml to get > the full classic Solr XML output that drives it all. Thank you Erik, I'll see what I get from it. cheers, GGhh

Re: more than one text corpus with solr?

2012-06-30 Thread Giovanni Gherdovich
2012/6/30 Afroz Ahmad: > You can set up multiple cores, each core managing a different index. > See http://wiki.apache.org/solr/CoreAdmin > thank you very much Ahmad for this hint. cheers, Giovanni

Re: querying thru solritas gives me zero results

2012-06-30 Thread Giovanni Gherdovich
- >8 -- -- >8 resuming: 1) I don't understand what does it mean defaulting 'qf' to 'name' in disMax. I have no field named 'name'. 2) From what I understand, my 'qf' value for disMax should default to 'text', the name of the field I care of. correct? cheers, Giovanni

querying thru solritas gives me zero results

2012-06-30 Thread Giovanni Gherdovich
max *:* 10 *,score name -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 any hint on how I can debug that? cheers, Giovanni

how to retrieve a doc from its docID ?

2012-06-30 Thread Giovanni Gherdovich
-- -- >8 -- -- here is the response if I query for "solar" : -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- 1.0 123 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- which is, solr gives me the doc ID. How to retrieve the doc's field "text" given its id ? cheers, Giovanni

more than one text corpus with solr?

2012-06-30 Thread Giovanni Gherdovich
Hi all, I am experimenting with Solr, and I feel the need to index more than just one corpus and search them with Solr independently. Is it possible to have this setup? Several independent indices all managed by the same Solr instance? cheers, Giovanni

return *all* words at levenstein distance = N from query word

2012-06-07 Thread Giovanni Gherdovich
Hi all, I am wondering if Solr can return all the words in my text corpus that are at a given Levenshtein distance from my query word. Possible? Difficult? Cheers, Giovanni
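To make the question concrete, here is a minimal brute-force sketch of what "all words at Levenshtein distance N" means. It runs outside Solr on a plain word list and says nothing about how Solr itself would answer such a request; the function names and the sample vocabulary are made up for illustration.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def words_at_distance(query, vocabulary, n):
    """Return the distinct vocabulary words exactly n edits away from query."""
    return sorted(w for w in set(vocabulary) if levenshtein(query, w) == n)


sample = ["solr", "sold", "solar", "soar", "lucene"]
matches = words_at_distance("solr", sample, 1)
```

Scanning every term this way costs one distance computation per vocabulary word, which is why it is only a way to define the expected result, not a scalable implementation.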

Re: indexing unstructured text (tweets)

2012-05-28 Thread Giovanni Gherdovich
2012/5/28 Jack Krupansky : > Ah, okay. Here's some PHP regexp code for parsing a raw tweet to get user > names and hash tags: > > http://saturnboy.com/2010/02/parsing-twitter-with-regexp/ Awesome! thank you very much Jack. GGhh
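The linked article uses PHP; the same idea can be sketched in Python. This is not the code from that page, only a rough equivalent under simple assumptions: a user name or hashtag is taken to be a run of word characters after `@` or `#`, and the marker must not be preceded by a word character (so e-mail addresses are skipped).

```python
import re

# Assumed patterns, not taken from the linked article.
MENTION = re.compile(r"(?<!\w)@(\w+)")
HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def parse_tweet(text):
    """Extract user names and hashtags from the raw text of a tweet."""
    return {
        "mentions": MENTION.findall(text),
        "hashtags": HASHTAG.findall(text),
    }


parsed = parse_tweet("@solr rocks, thanks @lucene_dev! #search #solr4 mail me at a@b.com")
```

The two lists could then be sent to Solr as separate multi-valued fields next to the tweet text, if the schema defines such fields.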

Re: indexing unstructured text (tweets)

2012-05-28 Thread Giovanni Gherdovich
(getting the Twitter feeds myself) on my machines, so that I can exploit the whole information content provided by Twitter. Cheers, Giovanni

Re: indexing unstructured text (tweets)

2012-05-28 Thread Giovanni Gherdovich
asking since I am new here to Solr. > Although, I imagine quite a few people have already done this quite a few > times before, so maybe somebody could contribute their Twitter Solr schema. > Anybody? Oh that would be nice :-) Cheers, Giovanni

Re: indexing unstructured text (tweets)

2012-05-28 Thread Giovanni Gherdovich
'll take it easy and do as you suggest. Cheers, Giovanni

indexing unstructured text (tweets)

2012-05-28 Thread Giovanni Gherdovich
explanation about the general picture? Can I index my tweets with Solr? Or do I need to put also Tika in my pipeline? Best regards, Giovanni Gherdovich

When all cores are ready to be used?

2010-12-13 Thread De Stefano, Giovanni, VF-Group
to reinitialize the cores but this doesn't sound right. Basically I would like to have a listener of an event "Solr created everything it needs, including cores, etc". How can I do this? Thanks, Giovanni

RE: Understanding Lucene's File Format

2010-09-17 Thread Giovanni Fernandez-Kincade
e frq/prx pointers, so that on seek we can rebase the decoding. Mike On Fri, Sep 17, 2010 at 10:02 AM, Giovanni Fernandez-Kincade wrote: >> The terms index (once loaded into RAM) has absolute longs, too. > > So in the TermInfo Index(.tii), the FreqDelta, ProxDelta, And SkipDelta &

RE: Understanding Lucene's File Format

2010-09-17 Thread Giovanni Fernandez-Kincade
s. Mike On Thu, Sep 16, 2010 at 3:53 PM, Giovanni Fernandez-Kincade wrote: > Hi, > I've been trying to understand Lucene's file format and I keep getting hung > up on one detail - how can Lucene quickly find the frequency data (or > proximity data) for a particular term?

Understanding Lucene's File Format

2010-09-16 Thread Giovanni Fernandez-Kincade
Hi, I've been trying to understand Lucene's file format and I keep getting hung up on one detail - how can Lucene quickly find the frequency data (or proximity data) for a particular term? According to the file formats page on the Lucene website

RE: FSDirectory Synchronization Issues

2010-04-27 Thread Giovanni Fernandez-Kincade
bug... Mike On Tue, Apr 27, 2010 at 2:34 PM, Giovanni Fernandez-Kincade wrote: > I was considering it, but we're already tight on memory usage. How do you > configure Solr to use it?  Is this correct? > > http://www.mail-archive.com/solr-user@lucene.apache.org/msg28574.html

RE: FSDirectory Synchronization Issues

2010-04-27 Thread Giovanni Fernandez-Kincade
ctory.class=org.apache.lucene.store.MMapDirectory -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Tuesday, April 27, 2010 2:28 PM To: solr-user@lucene.apache.org Subject: Re: FSDirectory Synchronization Issues Try MMapDirectory? Mike On Tue, Apr 27, 2010 at 2:09 PM, Giovann

FSDirectory Synchronization Issues

2010-04-27 Thread Giovanni Fernandez-Kincade
Hello, I'm encountering a lot of contention around SimpleFSDirectory$SimpleFSIndexInput.readInternal, pretty much identical to what this user described back in 2008: http://www.mail-archive.com/solr-user@lucene.apache.org/msg15516.html I also found this JIRA issue, where it appears that the conc

autocommiting with expungeDeletes=true

2010-04-08 Thread Giovanni Fernandez-Kincade
Is there any way to configure autocommit to expungeDeletes? Looking at the code it seems that there isn't... From org.apache.solr.update.DirectUpdateHandler2: public synchronized void run() { long started = System.currentTimeMillis(); try { CommitUpdateCommand command =

RE: PDFBox/Tika Performance Issues

2010-03-23 Thread Giovanni Fernandez-Kincade
nasa.gov] Sent: Tuesday, March 23, 2010 11:03 AM To: solr-user@lucene.apache.org Subject: Re: PDFBox/Tika Performance Issues Hi Giovanni, The error that you're showing in your logs below indicates that this message signature: org.apache.solr.handler.ContentStreamLoader.load(Lorg

RE: PDFBox/Tika Performance Issues

2010-03-23 Thread Giovanni Fernandez-Kincade
ser@lucene.apache.org Subject: Re: PDFBox/Tika Performance Issues What's your configuration look like for the ExtractReqHandler? On Mar 19, 2010, at 2:42 PM, Giovanni Fernandez-Kincade wrote: > Yeah I've been trying that - I keep getting this error when indexing a PDF > with a trunk-b

RE: PDFBox/Tika Performance Issues

2010-03-19 Thread Giovanni Fernandez-Kincade
al Message- From: Grant Ingersoll [mailto:gsi...@gmail.com] On Behalf Of Grant Ingersoll Sent: Friday, March 19, 2010 1:46 PM To: solr-user@lucene.apache.org Subject: Re: PDFBox/Tika Performance Issues Can you try trunk? On Mar 19, 2010, at 1:12 PM, Giovanni Fernandez-Kincade wrote: > Solr

RE: PDFBox/Tika Performance Issues

2010-03-19 Thread Giovanni Fernandez-Kincade
Time:Wed Mar 17 17:05:19 EDT 2010 -Original Message- From: Grant Ingersoll [mailto:gsi...@gmail.com] On Behalf Of Grant Ingersoll Sent: Friday, March 19, 2010 1:02 PM To: solr-user@lucene.apache.org Subject: Re: PDFBox/Tika Performance Issues On Mar 16, 2010, at 6:55 PM, Giovanni Fernandez

RE: PDFBox/Tika Performance Issues

2010-03-19 Thread Giovanni Fernandez-Kincade
Yeah I had tested it previously and that works... -Original Message- From: Mattmann, Chris A (388J) [mailto:chris.a.mattm...@jpl.nasa.gov] Sent: Friday, March 19, 2010 12:04 AM To: solr-user@lucene.apache.org Subject: Re: PDFBox/Tika Performance Issues Hi Giovanni, Let's tr

stream.url Contention

2010-03-18 Thread Giovanni Fernandez-Kincade
I recently switched from posting a file (PDFs in this case) to the Extract handler, to using the Stream.URL parameter. I've noticed a huge amount of contention around opening URL connections: http-8080-Processor36 [BLOCKED] CPU time: 0:47 sun.net.www.protocol.file.Handler.openConnection(URL) jav

RE: PDFBox/Tika Performance Issues

2010-03-17 Thread Giovanni Fernandez-Kincade
Re: PDFBox/Tika Performance Issues Hi Giovanni, Comments below: > I'm pretty unclear on how to patch the Tika 0.7-trunk on our Solr instance. > This is what I've tried so far (which was really just me guessing): > > > > 1. Got the latest version of the trunk code fr

RE: PDFBox/Tika Performance Issues

2010-03-16 Thread Giovanni Fernandez-Kincade
there were no errors logged as a result, but the PDF data does not appear to have been extracted (the field I used for map.content had an empty-string as a value). What's the right approach to perform this patch? -Original Message- From: Giovanni Fernandez-Kincade [mailto:

RE: PDFBox/Tika Performance Issues

2010-03-16 Thread Giovanni Fernandez-Kincade
efully next few weeks). Cheers, Chris [1] http://issues.apache.org/jira/browse/TIKA-380 [2] http://www.mail-archive.com/tika-u...@lucene.apache.org/msg00302.html On 3/16/10 2:31 PM, "Giovanni Fernandez-Kincade" wrote: Originally 16 (the number of CPUs on the machine), but even with 5

RE: PDFBox/Tika Performance Issues

2010-03-16 Thread Giovanni Fernandez-Kincade
ar 16, 2010, at 4:37 PM, Giovanni Fernandez-Kincade wrote: > I've been trying to bulk index about 11 million PDFs, and while profiling our > Solr instance, I noticed that all of the threads that are processing indexing > requests are constantly blocking each other during this c

PDFBox/Tika Performance Issues

2010-03-16 Thread Giovanni Fernandez-Kincade
I've been trying to bulk index about 11 million PDFs, and while profiling our Solr instance, I noticed that all of the threads that are processing indexing requests are constantly blocking each other during this call: http-8080-Processor39 [BLOCKED] CPU time: 9:35 java.util.Collections$Synchroni

Master Read Timeout

2010-01-25 Thread Giovanni Fernandez-Kincade
I have a slave that is pulling multiple cores from one master, and I'm very frequently seeing cases where the slave is getting timeouts when fetching from the master: 2010-01-25 11:00:22,819 [pool-3-thread-1] ERROR org.apache.solr.handler.SnapPuller - Master at: http://shredder:8080/solr/Filin

Cores + Replication Config

2010-01-11 Thread Giovanni Fernandez-Kincade
If you want to share one config amidst master & slaves, using Solr 1.4 replication, is there a way to specify whether a core is Master or Slave when using the CREATE Core command? Thanks, Gio.

RE: checkindex

2010-01-08 Thread Giovanni Fernandez-Kincade
p lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex -fix /path/to/solr/data/index/ hope that helps, -Ian On 1/8/10 2:09 PM, Giovanni Fernandez-Kincade wrote: > > I've seen many mentions of the Lucene CheckIndex tool, but where can I > find it? Is there any documentation on how to use

checkindex

2010-01-08 Thread Giovanni Fernandez-Kincade
I've seen many mentions of the Lucene CheckIndex tool, but where can I find it? Is there any documentation on how to use it? I noticed Luke has it built-in, but I can't get Luke to open my index with the "Don't open IndexReader(when opening corrupted index)" option check. Opening even an index

RE: replication --> missing field data file

2010-01-07 Thread Giovanni Fernandez-Kincade
up is just to take periodic backups, not necessary for the ReplicationHandler to work On Thu, Jan 7, 2010 at 2:37 AM, Giovanni Fernandez-Kincade wrote: > How can you tell when the backup is done? > > -Original Message- > From: noble.p...@gmail.com [mailto:noble.p...@gmail.c

RE: replication --> missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
in the name "index" others will be stored as index On Wed, Jan 6, 2010 at 10:31 PM, Giovanni Fernandez-Kincade wrote: > How can you differentiate between the backup and the normal index files? > > -Original Message- > From: noble.p...@gmail.com [mailto:noble.p..

RE: replication --> missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
eld data file On Wed, Jan 6, 2010 at 9:49 PM, Giovanni Fernandez-Kincade wrote: > I set up replication between 2 cores on one master and 2 cores on one slave. > Before doing this the master was working without issues, and I stopped all > indexing on the master. > > Now that repl

replication --> missing field data file

2010-01-06 Thread Giovanni Fernandez-Kincade
I set up replication between 2 cores on one master and 2 cores on one slave. Before doing this the master was working without issues, and I stopped all indexing on the master. Now that replication has synced the index files, an .FDT field is suddenly missing on both the master and the slave. Pr

Solr Replication Questions

2010-01-05 Thread Giovanni Fernandez-Kincade
http://wiki.apache.org/solr/SolrReplication I've been looking over this replication wiki and I'm still unclear on two points about Solr Replication: 1. If there have been small changes to the index on the master, does the slave copy the entire contents of the index files that were affecte

RE: Solr Cell - PDFs plus literal metadata - GET or POST ?

2010-01-05 Thread Giovanni Fernandez-Kincade
Really? Doesn't it have to be delimited differently, if both the file contents and the document metadata will be part of the POST data? How does Solr Cell tell the difference between the literals and the start of the file? I've tried this before and haven't had any luck with it. -Original

RE:Delete, commit, optimize doesn't reduce index file size

2009-12-30 Thread Giovanni Fernandez-Kincade
Is there another way to make this happen without making further changes to the index? Maybe a bounce of the servlet server? On Tue, Dec 29, 2009 at 1:23 PM, markwaddle wrote: I have an index that used to have ~38M docs at 17.2GB. I deleted all but 13K docs using a delete by query, commit and t

RE: Unable to delete from index

2009-12-28 Thread Giovanni Fernandez-Kincade
config.xml Ankit -Original Message----- From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com] Sent: Monday, December 28, 2009 5:46 PM To: solr-user@lucene.apache.org Subject: RE: Unable to delete from index Sorry - hit reply too early. I edited my config as you suggested

RE: Unable to delete from index

2009-12-28 Thread Giovanni Fernandez-Kincade
Sorry - hit reply too early. I edited my config as you suggested, rebooted Tomcat, and I can still find the doc through the Solr Admin interface even though I can't find it in Luke. -Original Message- From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com]

RE: Unable to delete from index

2009-12-28 Thread Giovanni Fernandez-Kincade
My HTTP caching is currently configured for Open Time So that shouldn't be the problem, right? -Original Message- From: AHMET ARSLAN [mailto:iori...@yahoo.com] Sent: Monday, December 28, 2009 5:31 PM To: solr-user@lucene.apache.org Subject: RE: Unable to delete from index > I opened

RE: Unable to delete from index

2009-12-28 Thread Giovanni Fernandez-Kincade
onday, December 28, 2009 4:54 PM To: 'solr-user@lucene.apache.org' Subject: RE: Unable to delete from index Are you deleting from correct index.[Meaning verify - Solr home] Also inspect thru luke to check the contents Ankit -Original Message- From: Giovanni Fe

Unable to delete from index

2009-12-28 Thread Giovanni Fernandez-Kincade
I'm having trouble performing deletes on a Solr 1.4 index. Whether I perform the deletes by query or by id, the document in question doesn't seem to get removed from the index. Even after a commit. I thought the problem might be the fact that I wasn't committing with expungeDeletes=true, but I'

Concurrent Merge Scheduler & MaxThread Count

2009-12-03 Thread Giovanni Fernandez-Kincade
I'm having trouble getting Solr to use more than one thread during index optimizations. I have the following in my solrconfig.xml: 6 I had the same problem some time ago, but upgrading to Solr 1.4 solved the problem. Now it's happening again, with Solr 1.4. No matter what I

RE: *:* Returning no results

2009-11-30 Thread Giovanni Fernandez-Kincade
: Monday, November 30, 2009 4:02 PM To: solr-user@lucene.apache.org Cc: Giovanni Fernandez-Kincade Subject: Re: *:* Returning no results Add debugQuery=on to give you clues. ~ David Smiley Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/ On Nov 30, 2009, at 3:54 PM, Giovanni

*:* Returning no results

2009-11-30 Thread Giovanni Fernandez-Kincade
Hi, I created a brand new core (on Solr 1.4), added a few documents and then searched for *:*, but got no results. Strangely enough, if I search for a specific document I know is in the index, like say "versionId:3", I get the expected result. Any ideas on why that might be? Thanks, Gio.

RE: Index Splitter

2009-11-25 Thread Giovanni Fernandez-Kincade
You can't really use this if you have an optimized index, right? -Original Message- From: Koji Sekiguchi [mailto:k...@r.email.ne.jp] Sent: Tuesday, November 24, 2009 6:57 PM To: solr-user@lucene.apache.org Subject: Re: Index Splitter Giovanni Fernandez-Kincade wrote: > Hi, >

Index Splitter

2009-11-24 Thread Giovanni Fernandez-Kincade
Hi, I've heard about a tool that can be used to split Lucene indexes, for cases where you want to break up a large index into shards. Do you know where I can find it? Any observations/recommendations about its use? This seems promising but I'm not sure if there is anything more mature out there

RE: Too Many Boolean Clauses

2009-10-22 Thread Giovanni Fernandez-Kincade
ginal Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Thursday, October 22, 2009 6:31 PM To: solr-user@lucene.apache.org Subject: Re: Too Many Boolean Clauses Giovanni Fernandez-Kincade wrote: > Hi, > I'm trying to perform a search against an integer field

Too Many Boolean Clauses

2009-10-22 Thread Giovanni Fernandez-Kincade
Hi, I'm trying to perform a search against an integer field with a ton of OR statements for each of the unique values that I want to search for. I pasted an example at the bottom of this email. Solr fires back the following error: org.apache.lucene.queryParser.ParseException: Cannot parse .. ': t

RE: Lucene Merge Threads

2009-10-14 Thread Giovanni Fernandez-Kincade
In case anyone is having the same problem, I finally got this working, using the nightly build link that Yonik sent around: http://people.apache.org/builds/lucene/solr/nightly/ Thanks, Gio. -Original Message- From: Giovanni Fernandez-Kincade Sent: Wednesday, October 14, 2009 2:10 PM To

RE: Lucene Merge Threads

2009-10-14 Thread Giovanni Fernandez-Kincade
Does anyone know the correct syntax to specify the maximum number of threads for the ConcurrentMergeScheduler? Also, is there any concrete way to know when the merge is actually complete (aside from profiling the machine)? Thanks, Gio. -Original Message- From: Giovanni Fernandez

RE: Lucene Merge Threads

2009-10-13 Thread Giovanni Fernandez-Kincade
ang.Class.$$YJP$$forName0(Native Method) at java.lang.Class.forName0(Unknown Source) at java.lang.Class.forName(Unknown Source) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:294) ... 28 more -Original Message- From: Giovanni Fernande

RE: Lucene Merge Threads

2009-10-13 Thread Giovanni Fernandez-Kincade
Will do. Thanks! -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Tuesday, October 13, 2009 11:48 AM To: solr-user@lucene.apache.org Subject: Re: Lucene Merge Threads On Tue, Oct 13, 2009 at 8:19 PM, Giovanni Fernandez-Kincade < gfernandez-k
