Re: Solr hanging when extracting a some broken .doc files

2013-12-18 Thread Charlie Hull
On 17/12/2013 15:29, Augusto Camarotti wrote: Hi guys, I'm having a problem with Solr when trying to index some broken .doc files. I have set up a test case using Solr to index all the files the users save on the shared directories of the company that I work for, and Solr is hanging when

Re: PostingsSolrHighlighter

2013-12-18 Thread Liu Bo
Hi Josip, for the first question, we've done similar things: copying the search fields to a text field. But highlighting is normally done on specific fields such as title, depending on how the search content is displayed on the front end. You can search on text and highlight on the field you want by specifying

Re: Solr hanging when extracting a some broken .doc files

2013-12-18 Thread Alexandre Rafalovitch
Charlie, Does it mean you are talking to it from a client program? Or are you running Tika in a listen/server mode and building some adapters for standard Solr processes? Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch -

Re: an array liked string is treated as multivalued when adding doc to solr

2013-12-18 Thread Liu Bo
Hi Alexandre, It's quite a rare case, just one out of tens of thousands. I'm planning to make every multilingual field multivalued and just take the first value while formatting the response to our business object. The first value update processor seems very helpful, thank you. All the best

DataImport Handler, writing a new EntityProcessor

2013-12-18 Thread Mathias Lux
Hi all! I've got a question regarding writing a new EntityProcessor, in the same sense as the Tika one. My EntityProcessor should analyze jpg images and create document fields to be used with the LIRE Solr plugin (https://bitbucket.org/dermotte/liresolr). Basically I've taken the same approach as

PeerSync Recovery fails, starting Replication Recovery

2013-12-18 Thread Anca Kopetz
Hi, In our SolrCloud cluster (2 shards, 8 replicas), the replicas go into the recovering state from time to time, and recovery takes more than 10 minutes to finish. In the logs, we see that PeerSync Recovery fails with the message: PeerSync: core=fr_green url=

Re: PostingsSolrHighlighter

2013-12-18 Thread Josip Delic
On 18.12.2013 09:55, Liu Bo wrote: hi Josip hi liu, for the 1 question we've done similar things: copying search field to a text field. But highlighting is normally on specific fields such as title depending on how the search content is displayed to the front end, you can search on text

Wildcard queries and custom char filter

2013-12-18 Thread michallos
Hello, I have a problem with configuring a custom char filter. When there are no wildcards in the query, my filter is invoked. When there are wildcards, my filter is not invoked. Is it possible to configure a charFilter to be used with wildcard queries? I can see that with wildcards,

Service Unavailable Error.

2013-12-18 Thread yriveiro
I'm seeing this error in my logs: ERROR - dat1 - 2013-12-18 11:40:11.704; org.apache.solr.update.StreamingSolrServers$1; error org.apache.solr.common.SolrException: Service Unavailable request:

No registered leader was found, but the UI says that I have.

2013-12-18 Thread yriveiro
I'm getting an error on Solr 4.6.0 about leader registration. The admin UI shows this: http://picpaste.com/a839446d0808df205aa7be78c780ed32.png But my logs say: ERROR - dat6 - 2013-12-18 11:43:54.253; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: No registered leader

Re: solr as nosql - pulling all docs vs deep paging limitations

2013-12-18 Thread Jens Grivolla
You can do range queries without an upper bound and just limit the number of results. Then you look at the last result to obtain the new lower bound. -- Jens On 17/12/13 20:23, Petersen, Robert wrote: My use case is basically to do a dump of all contents of the index with no ordering

Re: Wildcard queries and custom char filter

2013-12-18 Thread Ahmet Arslan
Hi, Yes, some factories implement org.apache.lucene.analysis.util.MultiTermAwareComponent. For more, please see http://wiki.apache.org/solr/MultitermQueryAnalysis On Wednesday, December 18, 2013 1:05 PM, michallos michal.ware...@gmail.com wrote: Hello, I have a problem with configuring custom

Re: Wildcard queries and custom char filter

2013-12-18 Thread michallos
It works! Thanks. Last question: how do I invoke the charFilter before the tokenizer? I can see that with the tokenizer StandardTokenizerFactory and no wildcards, the text 123-abc is broken into two tokens, 123 and abc, but the text *123-abc* remains unchanged as *123-abc*. Is it possible to use charFilter before

RE: DataImport Handler, writing a new EntityProcessor

2013-12-18 Thread Dyer, James
The first thing I would suggest is to try and run it not in debug mode. DIH's debug mode limits the number of documents it will take in, so that might be all that is wrong here. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: mathias@gmail.com

solrcloud no server hosting shard

2013-12-18 Thread gf80
Hi guys, before starting, note that I am new to Solr and in particular to SolrCloud. I have to index a large number of documents (10 million). Last week I completed my import handler and configuration, so I started the import on Solr using SolrCloud with 10 shards (and without replicas :S ) on

Re: DataImport Handler, writing a new EntityProcessor

2013-12-18 Thread Mathias Lux
Unfortunately it is the same in non-debug, just the first document. I also output the params to sout, but it seems only the first one is ever arriving at my custom class. I've the feeling that I'm doing something seriously wrong here, based on a complete misunderstanding :) I basically assume that

Dynamically deriving the param value in solrconfig requestHandler

2013-12-18 Thread Senthilnathan Vijayaraja
Hi, Is there any way to derive a param's value from other params, like below? <requestHandler name="/main" class="com.solr.custom.handler.MySearchHandler"> <arr name="components"> <str>query</str> <str>debug</str> </arr> <lst name="defaults"> <str name="size_relaxed">*size:['$minSize' TO

Re: solr as nosql - pulling all docs vs deep paging limitations

2013-12-18 Thread Mikhail Khludnev
Aha! SOLR-5244 is a particular case of what I'm asking about. I wonder who else considers it useful? (I'm sorry if I hijacked the thread) On 18.12.2013 5:41, Joel Bernstein joels...@gmail.com wrote: They are for different use cases. Hoss's approach, I believe, focuses on deep paging of

RE: Solr failure results in misreplication?

2013-12-18 Thread Tim Potter
Any chance you still have the logs from the servers hosting 1 & 2? I would open a JIRA ticket for this one as it sounds like something went terribly wrong on restart. You can update the /clusterstate.json to fix this situation. Lastly, it's recommended to use an OOM killer script with

Re: Wildcard queries and custom char filter

2013-12-18 Thread michallos
Oh, I can see that when there are wildcards, KeywordTokenizerFactory is used instead of StandardTokenizerFactory. I created a custom wildcard-remover char filter for a few specific cases (so I cannot use any of the regex replacer filters), but even with that, KeywordTokenizerFactory is used. I

Re: solr as nosql - pulling all docs vs deep paging limitations

2013-12-18 Thread Chris Hostetter
: : What about SELECT * FROM WHERE ... like misusing Solr? I'm sure you've been : asked many times for that. : What if the client doesn't need to rank results somehow, but is just requesting : an unordered filtering result like they are used to in an RDBMS? : Do you feel it will never be considered as a reasonable

Re: solr as nosql - pulling all docs vs deep paging limitations

2013-12-18 Thread Chris Hostetter
: You can do range queries without an upper bound and just limit the number of : results. Then you look at the last result to obtain the new lower bound. Exactly. Instead of this: First: q=foo&start=0&rows=$ROWS After: q=foo&start=$X&rows=$ROWS ...where $ROWS is how big a batch of docs you

Re: solr as nosql - pulling all docs vs deep paging limitations

2013-12-18 Thread Michael Della Bitta
Us too. That's going to be huge for us! Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. The Science of Influence Marketing 18 East 41st Street New York, NY 10017 t: @appinions https://twitter.com/Appinions | g+:

Re: solr as nosql - pulling all docs vs deep paging limitations

2013-12-18 Thread Jonathan Rochkind
On 12/17/13 1:16 PM, Chris Hostetter wrote: As I mentioned in the blog above, as long as you have a uniqueKey field that supports range queries, bulk exporting of all documents is fairly trivial by sorting on your uniqueKey field and using an fq that also filters on your uniqueKey field modify
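
The approach described in this thread (sort on the uniqueKey, limit the batch size, and use the last key seen as an exclusive lower bound for the next request) can be sketched roughly as follows. This is only an illustration under stated assumptions: the uniqueKey field is assumed to be called `id`, `fetch` is a hypothetical callable standing in for whatever client actually sends the request to Solr, and `make_fake_fetch` is an in-memory stand-in so the loop can be run without a server. Key values containing query-syntax characters would need escaping, which is not shown.

```python
from typing import Callable, Dict, Iterator, List, Optional

Doc = Dict[str, str]
Params = Dict[str, str]


def build_params(query: str, rows: int, last_key: Optional[str],
                 key_field: str = "id") -> Params:
    """Build request parameters for one batch, sorted on the unique key."""
    params = {"q": query, "sort": f"{key_field} asc", "rows": str(rows)}
    if last_key is not None:
        # Exclusive lower bound, no upper bound: everything after last_key.
        params["fq"] = f"{key_field}:{{{last_key} TO *]"
    return params


def export_all(fetch: Callable[[Params], List[Doc]], query: str, rows: int,
               key_field: str = "id") -> Iterator[Doc]:
    """Yield every matching document, one batch at a time."""
    last_key: Optional[str] = None
    while True:
        batch = fetch(build_params(query, rows, last_key, key_field))
        if not batch:
            return  # an empty batch means the export is complete
        yield from batch
        # The last document of this batch gives the next lower bound.
        last_key = batch[-1][key_field]


def make_fake_fetch(docs: List[Doc]) -> Callable[[Params], List[Doc]]:
    """Hypothetical in-memory stand-in for a Solr request (assumption)."""
    ordered = sorted(docs, key=lambda d: d["id"])

    def fetch(params: Params) -> List[Doc]:
        lower: Optional[str] = None
        fq = params.get("fq")
        if fq:
            # Pull the lower bound back out of "id:{value TO *]".
            lower = fq.split("{", 1)[1].split(" TO ", 1)[0]
        matching = [d for d in ordered if lower is None or d["id"] > lower]
        return matching[: int(params["rows"])]

    return fetch


if __name__ == "__main__":
    index = [{"id": f"doc{n:02d}"} for n in range(1, 8)]
    for doc in export_all(make_fake_fetch(index), "*:*", rows=3):
        print(doc["id"])
```

Because each request carries its own lower bound, the server never has to skip over earlier results the way a growing `start` offset forces it to.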

Re: Solr3.4 on tomcat 7.0.23 - hung with error threw exception java.lang.IllegalStateException: Cannot call sendError() after the response has been committed

2013-12-18 Thread solr-user
Were you able to resolve this issue, and if so, how? I am encountering the same issue in a couple of Solr versions (including 4.0 and 4.5)

org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.

2013-12-18 Thread neerajp
Hi, I am using ExtractingRequestHandler to extract text from binary data and then index the text, but I am getting this error: org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit. solrconfig.xml: <requestHandler name="/update/extract"

Re: solr as nosql - pulling all docs vs deep paging limitations

2013-12-18 Thread Chris Hostetter
: One question that I was never sure about when trying to do things like this -- : is this going to end up blowing the query and/or document caches if used on a : live Solr? By filling up those caches with the results of the 'bulk' export? : If so, is there any way to avoid that? Or does it

Shards stuck in down state after splitting shard - How can we recover from a failed SPLITSHARD?

2013-12-18 Thread cwhi
I called SPLITSHARD on a shard in an existing SolrCloud instance, where the shard had ~1 million documents in it. It's been about 3 hours since that splitting has completed, and the subshards are still stuck in a Down state. They are reported as down in localhost/solr/#/~cloud, and I'm unable to

Re: DataImport Handler, writing a new EntityProcessor

2013-12-18 Thread P Williams
Hi Mathias, I'd recommend testing one thing at a time. See if you can get it to work for one image before you try a directory of images. Also try testing with the solr-testframework in your IDE (I use Eclipse) to debug, rather than using your browser and print statements. Hopefully that will give you

Re: solr as nosql - pulling all docs vs deep paging limitations

2013-12-18 Thread Mikhail Khludnev
On Wed, Dec 18, 2013 at 8:03 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : : What about SELECT * FROM WHERE ... like misusing Solr? I'm sure you've been : asked many times for that. : What if the client doesn't need to rank results somehow, but is just requesting : an unordered filtering result

Solr could replace shards

2013-12-18 Thread Max Hansmire
I am considering using SolrCloud, but I have a use case that I am not sure if it covers. I would like to keep an index up to date in realtime, but also I would like to sometimes restate the past. The way that I would restate the past is to do batch processing over historical data. My idea is

Re: Shards stuck in down state after splitting shard - How can we recover from a failed SPLITSHARD?

2013-12-18 Thread Anshum Gupta
Hi, Is the parent shard currently active? What does the clusterstate.json say? The subshard could be stuck in down when it's trying to recover but as far as I remember, the sub-shards only get marked active (and the parent goes inactive) once the recovery and replication (for as many replicas as

Re: solrcloud no server hosting shard

2013-12-18 Thread Furkan KAMACI
Hi Guiseppe; First of all, you should give us the full error log so we can understand the reason behind the error. On the other hand, it is not a must to have extra replicas for your shards, but you really should consider having replicas. When you start up a new Solr instance it will be assigned to one of

Re: No registered leader was found, but the UI says that I have.

2013-12-18 Thread Furkan KAMACI
Hi; Do you have any error log for leader election? Also, do you get this error always, or only while the other replica is in recovery mode? Thanks; Furkan KAMACI On Wednesday, 18 December 2013, yriveiro yago.rive...@gmail.com wrote: I'm getting an

Re: PeerSync Recovery fails, starting Replication Recovery

2013-12-18 Thread Furkan KAMACI
Hi Anca; Could you check the conversation here: http://lucene.472066.n3.nabble.com/ColrCloud-IOException-occured-when-talking-to-server-at-td4061831.html Thanks; Furkan KAMACI On Wednesday, 18 December 2013, Anca Kopetz anca.kop...@kelkoo.com wrote: Hi, In our

Solr 4.5 - Solr Cloud is creating new cores on random nodes

2013-12-18 Thread Ryan Wilson
Hello all, I am currently in the process of building out a solr cloud with solr 4.5 on 4 nodes with some pretty hefty hardware. When we create the collection we have a replication factor of 2 and store 2 replicas per node. While we have been experimenting, which has involved bringing nodes up

email datasource connect timeout issue

2013-12-18 Thread xie kidd
Hi all, When I try to set up an email data source as described at http://wiki.apache.org/solr/MailEntityProcessor , a connect timeout exception occurs. I am sure the user and password are correct, and the RSS data source also works well. Can anyone do me a favor? This issue is on Solr 4.5 with Tomcat 7,

Re: Solr-839 and version 4.5 (XmlQueryParser)

2013-12-18 Thread Puneet Pawaia
Hi, Just in case it is of use to anyone, I managed to compile the 4.0 patch by changing the line where new CoreParser is created to below. CoreParser parser = new CoreParser(defaultField, getReq().getSchema().getQueryAnalyzer()); The parser seems to work for the simple tests that I have done so

Re: PostingsSolrHighlighter

2013-12-18 Thread Liu Bo
Hi Josip, that's quite weird. In my experience, highlighting is strict on string fields, which need an exact match; text fields should be fine. I copied your schema definition and did a quick test in a new core, with everything at the defaults from the tutorial, and the search component is using

Concurrent request configurations for Solr Processors

2013-12-18 Thread Dileepa Jayakody
Hi All, I have written a custom update request processor and configured an UpdateRequestProcessor chain in solrconfig.xml as below: <updateRequestProcessorChain name="stanbolInterceptor"> <processor class="com.solr.stanbol.processor.StanbolContentProcessorFactory" /> <processor