Re: "What is Solr" in Google search results

2017-08-30 Thread Rick Leir
Vincenzo, 
This is a discussion for the Wikipedia 'talk' page. My sense is that
information must be verifiable, and that the popularity rating at DB-Engines is
not transparent. Would you like to start the discussion? Cheers -- Rick

On August 30, 2017 5:17:25 PM MDT, Vincenzo D'Amore  wrote:
>Hi All,
>
>googling for "what is Solr" I found this as *first* sentence:
>
>"Solr is the second-most popular enterprise search engine after
>Elasticsearch. ... "
>
>The description comes from wikipedia
>https://en.wikipedia.org/wiki/Apache_Solr
>
>Now, well, I'm a little upset, because I think this is a misleading
>description, this answer does not really... well, answer the question.
>
>And even... because Solr is not the first most popular :)))
>
>Ok, seriously, the first sentence (or the answer at all) should not define
>the position of the search engine in a list, in a kind of competition where
>Solr has the second place.
>If it is the first, the second or whatever most popular is not the right
>answer.
>
>So I want to inform the community and ask for advice, if any, on how to
>get a better description in the Google results page.
>
>If you have any comments or questions, please let me know.
>
>Best regards,
>Vincenzo
>
>
>-- 
>Vincenzo D'Amore
>email: v.dam...@gmail.com
>skype: free.dev
>mobile: +39 349 8513251

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Indexed=false for a field,but still able to search on field.

2017-08-30 Thread Rick Leir
Ashish,
Fast search depends on indexing the data. If it is not indexed, then the search 
becomes a full table scan which is much slower. Cheers -- Rick
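
A toy illustration of the difference described above, not Solr or Lucene
code: the document set and both function names are made up for this sketch.
The scan reads every document for every query, while the inverted index is
built once and then answers a term lookup directly.

```python
from collections import defaultdict

# Hypothetical mini-corpus: document id -> text.
docs = {
    1: "solr is a search platform",
    2: "lucene is a java library",
    3: "solr is built on lucene",
}


def scan_search(term):
    """Un-indexed lookup: every document is read and checked (full scan)."""
    return sorted(doc_id for doc_id, text in docs.items() if term in text.split())


def build_inverted_index():
    """Indexed lookup: map each term to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


index = build_inverted_index()


def indexed_search(term):
    """Answer a term query from the prebuilt index without touching the documents."""
    return sorted(index.get(term, set()))


print(scan_search("lucene"))     # [2, 3]
print(indexed_search("lucene"))  # [2, 3]
```

Both calls return the same ids; the cost differs because the scan grows with
the number of documents while the index lookup does not.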

On August 29, 2017 11:57:44 AM MDT, AshB  wrote:
>Hi,
>
>Thanks, got it: this is happening because of docValues=true.
>
>Please elaborate on "full table scan search"
>
>Regards
>Ashish
>
>
>
>--
>View this message in context:
>http://lucene.472066.n3.nabble.com/Indexed-false-for-a-field-but-still-able-to-search-on-field-tp4352338p4352599.html
>Sent from the Solr - User mailing list archive at Nabble.com.

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Different ideas for querying unique and non-unique records

2017-08-30 Thread Rick Leir
Susheel, just a guess, but carrot2.org might be useful, though it might be
overkill. Cheers -- Rick

On August 30, 2017 7:40:08 AM MDT, Susheel Kumar  wrote:
>Hello,
>
>I am looking for different ideas/suggestions to solve the use case I am
>working on.
>
>We have a couple of fields in the schema along with id: business_email and
>personal_email. We need to return all records based on unique business and
>personal emails.
>
>A record is unique if neither its business nor its personal email is
>repeated in any other record.
>A record is non-unique if any of its business or personal emails occurs in
>other records; then all those records are non-unique.
>E.g. considering the documents below:
>- for unique records, only id=1 should be returned (since john.doe is
>not present in any other record's personal or business email)
>- for non-unique records, id=2,3 should be returned (since isabel.dora is
>present in multiple records; it doesn't matter whether it is present in the
>business or personal email)
>
>Documents
>===
>{id:1,business_email_s:john@abc.com,personal_email_s:john@abc.com}
>{id:2,business_email_s:isabel.d...@abc.com}
>{id:3,personal_email_s:isabel.d...@abc.com}
>
>I am able to solve this using a Streaming Expression query, but I am not
>sure whether performance will become a bottleneck, as the streaming
>expression is quite big. So I am looking for different ideas, like using
>de-dupe or handling this during ingestion/pre-processing, without impacting
>performance much.
>
>Thanks,
>Susheel

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 
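
One way to picture the ingestion/pre-processing idea raised in the quoted
question is the sketch below. It is only an illustration in plain Python,
not a Solr feature: the email addresses are placeholders (the archive
obscures the real ones), and the helper names are invented here. It counts
in how many records each address appears and marks a record non-unique as
soon as any of its addresses appears in more than one record.

```python
from collections import Counter

# Placeholder documents modelled on the example in the thread.
docs = [
    {"id": 1, "business_email_s": "john.doe@example.com",
     "personal_email_s": "john.doe@example.com"},
    {"id": 2, "business_email_s": "isabel.dora@example.com"},
    {"id": 3, "personal_email_s": "isabel.dora@example.com"},
]
EMAIL_FIELDS = ("business_email_s", "personal_email_s")


def emails_of(doc):
    """Distinct addresses of one record, so a record counts each address once."""
    return {doc[field] for field in EMAIL_FIELDS if field in doc}


def split_unique(records):
    """Return (unique_ids, non_unique_ids) following the criteria in the thread."""
    counts = Counter(email for doc in records for email in emails_of(doc))
    unique, non_unique = [], []
    for doc in records:
        repeated = any(counts[email] > 1 for email in emails_of(doc))
        (non_unique if repeated else unique).append(doc["id"])
    return unique, non_unique


print(split_unique(docs))  # ([1], [2, 3])
```

If something like this ran before indexing, the result could be stored in a
flag field and queried with a plain filter, at the price of recomputing the
flag for affected records whenever new data arrives.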

Solr Reindex Issue - Can't able to Reindex Old Data

2017-08-30 Thread @Nandan@
Hi ,

I am using Apache Solr with a Cassandra database. My table has 20
rows. Because of some changes, I modified my Solr schema and reindexed with
the options below:

*reindex=true and deleteAll=false*

After reindexing, my old data, which was already present in the table, is
not reindexed. I can only retrieve data that was added after the reindex.

Please help with this issue.

Thanks


Re: "What is Solr" in Google search results

2017-08-30 Thread Doug Turnbull
I question the accuracy of that "Second most popular" on a couple of fronts:

Maybe it's the most popular! -- I speak at Elasticsearch meetups. It's 90%
logs logs logs, with some search thrown in. Solr meetups have a tremendous
amount of information retrieval. Giving an information retrieval talk at
Elasticsearch meetups sometimes gets blank stares (though in many cases
not).
Maybe it's less popular! -- Is DB Engines really a scientific source here?
Maybe MySQL LIKE statements are still the most popular enterprise search
engine :-p

-Doug

On Wed, Aug 30, 2017 at 8:52 PM Leonardo Perez Pulido <
leoperezpul...@gmail.com> wrote:

> Hi,
>
> I think there are many things to consider besides the 'normal' search you
> did:
>
> - First of all, Google search results vary. The search algorithm of Google
> changes all the time.
> - Many different elements determine 'what' Google scores as top docs in
> search results, among them:
>   - The device you are searching on.
>   - Your search history.
>   - If you are logged into a Google account.
>   - Your geographical location.
>   - The type of search you are doing, whether it is a term/keyword or a
>     phrase search.
>   - And if it is summer or winter (joking, nobody knows with Google).
>
> For example, the same search phrase from my location returns a very
> different result as top doc:
>
> Apache *Solr* is an open source search platform built upon a Java library
> called Lucene.
>
> Which really is a definition of what Solr is.
>
> So, in conclusion, if you want a better search engine than that: use Solr.
> :)
>
> On Wed, Aug 30, 2017 at 7:17 PM, Vincenzo D'Amore 
> wrote:
>
> > Hi All,
> >
> > googling for "what is Solr" I found this as *first* sentence:
> >
> > "Solr is the second-most popular enterprise search engine after
> > Elasticsearch. ... "
> >
> > The description comes from wikipedia
> > https://en.wikipedia.org/wiki/Apache_Solr
> >
> > Now, well, I'm a little upset, because I think this is a misleading
> > description, this answer does not really... well, answer the question.
> >
> > And even... because Solr is not the first most popular :)))
> >
> > Ok, seriously, the first sentence (or the answer at all) should not define
> > the position of the search engine in a list, in a kind of competition where
> > Solr has the second place.
> > If it is the first, the second or whatever most popular is not the right
> > answer.
> >
> > So I want to inform the community and ask for advice, if any, on how to
> > get a better description in the Google results page.
> >
> > If you have any comments or questions, please let me know.
> >
> > Best regards,
> > Vincenzo
> >
> >
> > --
> > Vincenzo D'Amore
> > email: v.dam...@gmail.com
> > skype: free.dev
> > mobile: +39 349 8513251
> >
>
-- 
Consultant, OpenSource Connections. Contact info at
http://o19s.com/about-us/doug-turnbull/; Free/Busy (http://bit.ly/dougs_cal)


Re: "What is Solr" in Google search results

2017-08-30 Thread Leonardo Perez Pulido
Hi,

I think there are many things to consider besides the 'normal' search you
did:

- First of all, Google search results vary. The search algorithm of Google
changes all the time.
- Many different elements determine 'what' Google scores as top docs in
search results, among them:
  - The device you are searching on.
  - Your search history.
  - If you are logged into a Google account.
  - Your geographical location.
  - The type of search you are doing, whether it is a term/keyword or a
    phrase search.
  - And if it is summer or winter (joking, nobody knows with Google).

For example, the same search phrase from my location returns a very
different result as top doc:

Apache *Solr* is an open source search platform built upon a Java library
called Lucene.

Which really is a definition of what Solr is.

So, in conclusion, if you want a better search engine than that: use Solr.
:)

On Wed, Aug 30, 2017 at 7:17 PM, Vincenzo D'Amore 
wrote:

> Hi All,
>
> googling for "what is Solr" I found this as *first* sentence:
>
> "Solr is the second-most popular enterprise search engine after
> Elasticsearch. ... "
>
> The description comes from wikipedia
> https://en.wikipedia.org/wiki/Apache_Solr
>
> Now, well, I'm a little upset, because I think this is a misleading
> description, this answer does not really... well, answer the question.
>
> And even... because Solr is not the first most popular :)))
>
> Ok, seriously, the first sentence (or the answer at all) should not define
> the position of the search engine in a list, in a kind of competition where
> Solr has the second place.
> If it is the first, the second or whatever most popular is not the right
> answer.
>
> So I want to inform the community and ask for advice, if any, on how to
> get a better description in the Google results page.
>
> If you have any comments or questions, please let me know.
>
> Best regards,
> Vincenzo
>
>
> --
> Vincenzo D'Amore
> email: v.dam...@gmail.com
> skype: free.dev
> mobile: +39 349 8513251
>


"What is Solr" in Google search results

2017-08-30 Thread Vincenzo D'Amore
Hi All,

googling for "what is Solr" I found this as *first* sentence:

"Solr is the second-most popular enterprise search engine after
Elasticsearch. ... "

The description comes from wikipedia
https://en.wikipedia.org/wiki/Apache_Solr

Now, well, I'm a little upset, because I think this is a misleading
description, this answer does not really... well, answer the question.

And even... because Solr is not the first most popular :)))

Ok, seriously, the first sentence (or the answer at all) should not define
the position of the search engine in a list, in a kind of competition where
Solr has the second place.
If it is the first, the second or whatever most popular is not the right
answer.

So I want to inform the community and ask for advice, if any, on how to
get a better description in the Google results page.

If you have any comments or questions, please let me know.

Best regards,
Vincenzo


-- 
Vincenzo D'Amore
email: v.dam...@gmail.com
skype: free.dev
mobile: +39 349 8513251


Error opening new searcher due to LockObtainFailedException

2017-08-30 Thread Sundeep T
Hello,

Occasionally we see errors opening a new searcher for certain Solr
cores. Whenever this happens, we are unable to query or ingest new data
into these cores. It seems to clear up after some time, though. The root
cause seems to be: *"org.apache.lucene.store.LockObtainFailedException:
Lock held by this virtual machine:
/opt/solr/volumes/data9/7d50b38e114af075-core-24/data/index/write.lock"*

Below is the full stack trace. Any ideas on what could cause such an
exception and how to mitigate it? Thanks a lot for your help!

Unable to create core [7d50b38e114af075-core-24],trace=org.apache.solr.common.SolrException: Unable to create core [7d50b38e114af075-core-24]
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:903)
at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1167)
at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:252)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:418)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:952)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:816)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:890)
... 30 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1891)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2011)
at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1041)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:925)
... 32 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine: /opt/solr/volumes/data9/7d50b38e114af075-core-24/data/index/write.lock
at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:127)
at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41)
at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45)
at org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:104)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:804)
at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:125)
at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100)
at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:240)
at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:114)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1852)
... 35 more
,code=500}


Knn classifier doesn't work

2017-08-30 Thread Adriano
Hello,

I'm trying to use the knn classifier by following this link:
https://wiki.apache.org/solr/SolrClassification

I use this config :



  classification

  

and 

  

Title,Body
Tags
Tags
knn
20
1
5




  

For the schema.xml :

  
  
  

There's no error, even in the log.

It's as if the updateRequestProcessorChain is not called. So I tried the
bayes algorithm instead, and this error occurs:

java.lang.NullPointerException
at org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignNormClasses(SimpleNaiveBayesDocumentClassifier.java:116)
at org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.getClasses(SimpleNaiveBayesDocumentClassifier.java:106)
at org.apache.solr.update.processor.ClassificationUpdateProcessor.processAdd(ClassificationUpdateProcessor.java:107)
at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at

Re: Bug in Solr 6.6.0? "Cannot change DocValues type from SORTED_SET to SORTED"

2017-08-30 Thread Erick Erickson
P.S. Perhaps the defaults changed when you upgraded for some reason?

Erick

On Wed, Aug 30, 2017 at 11:15 AM, Erick Erickson
 wrote:
> This usually means you changed multiValued from true to false or vice
> versa then added more docs.
>
> So since each segment is its own "mini index", different segments have
> different expectations and when you query this error is thrown.
>
> Most of the time when you change a field's type in the schema you have
> to re-index from scratch. And I'd delete *:* first (or just use a new
> collection and alias).
>
> Best,
> Erick
>
> On Wed, Aug 30, 2017 at 10:04 AM, Stephan Schubert
>  wrote:
>> After I tried an update from Solr 6.5.0 to Solr 6.6.0 (SolrCloud mode), I
>> get the following error in one collection:
>>
>> "Cannot change DocValues type from SORTED_SET to SORTED" for field
>> "index_todelete".
>>
>> I had a look at the index values (where set, all are true; otherwise not
>> filled; checked via faceting in the working instance) and I can't see any
>> special issues with this field. If I move back to Solr 6.5.0, the
>> Solr collection comes up normally with the same set of index data. So I
>> assume something changed in 6.6.0, but I couldn't find anything in the
>> release notes or in the known issues in JIRA.
>>
>> Does anyone have an idea what's going on here? The field doesn't even have
>> docValues or multiValued set, so I don't understand the error message
>> here.
>>
>> Configuration in schema.xml:
>> > stored="true" type="boolean"/>
>>
>>
>> Error Log:
>> java.util.concurrent.ExecutionException:
>> org.apache.solr.common.SolrException: Unable to create core
>> [GLOBAL-Fileshares-Index_shard1_replica2]
>>  at
>> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>  at
>> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>>  at
>> org.apache.solr.core.CoreContainer.lambda$load$6(CoreContainer.java:586)
>>  at
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>  at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>  at
>> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>  at
>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>>  at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>  at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>  at java.lang.Thread.run(Thread.java:745)
>> Caused by: org.apache.solr.common.SolrException: Unable to create core
>> [GLOBAL-Fileshares-Index_shard1_replica2]
>>  at
>> org.apache.solr.core.CoreContainer.create(CoreContainer.java:935)
>>  at
>> org.apache.solr.core.CoreContainer.lambda$load$5(CoreContainer.java:558)
>>  at
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
>>  ... 5 more
>> Caused by: org.apache.solr.common.SolrException: Error opening new
>> searcher
>>  at
>> org.apache.solr.core.SolrCore.<init>(SolrCore.java:977)
>>  at
>> org.apache.solr.core.SolrCore.<init>(SolrCore.java:830)
>>  at
>> org.apache.solr.core.CoreContainer.create(CoreContainer.java:920)
>>  ... 7 more
>> Caused by: org.apache.solr.common.SolrException: Error opening new
>> searcher
>>  at
>> org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2069)
>>  at
>> org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2189)
>>  at
>> org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1071)
>>  at
>> org.apache.solr.core.SolrCore.<init>(SolrCore.java:949)
>>  ... 9 more
>> Caused by: java.lang.IllegalArgumentException: cannot change DocValues
>> type from SORTED_SET to SORTED for field "index_todelete"
>>  at
>> org.apache.lucene.index.FieldInfo.setDocValuesType(FieldInfo.java:212)
>>  at
>> org.apache.lucene.index.FieldInfos$Builder.addOrUpdateInternal(FieldInfos.java:430)
>>  at
>> org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:438)
>>  at
>> org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:375)
>>  at
>> org.apache.lucene.index.MultiFields.getMergedFieldInfos(MultiFields.java:245)
>>  at
>> org.apache.solr.index.SlowCompositeReaderWrapper.getFieldInfos(SlowCompositeReaderWrapper.java:266)
>>  at
>> org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:281)
>>  at
>> org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2037)
>>  ... 12 

Re: Solr client

2017-08-30 Thread Leonardo Perez Pulido
Hi,
Apart from taking a look at Solr's wiki, I think one of the main reasons
why these APIs are all outdated is that Solr itself provides the 'API' to
many different languages in the form of output formats.

As you may know, the main protocol Solr uses to communicate with
its clients is HTTP. Many (if not all) of today's programming languages
provide a means to send requests to Solr via HTTP, and Solr responds to
each of those languages via the different available response formats.

By default there are response formats for JavaScript, Python, Ruby, and
SolrJ (Java). All of these response formats are first-class citizens in Solr.
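
To make the point about response formats concrete, here is a small sketch
that only builds a query URL and does not contact any server. The host,
port and collection name are placeholders chosen for the example; the
/select path and the wt (response writer) parameter are the standard Solr
ones.

```python
from urllib.parse import urlencode


def select_url(base_url, collection, query, wt="json", rows=10):
    """Build a Solr /select URL; wt picks the response format (json, python, ruby, ...)."""
    params = urlencode({"q": query, "wt": wt, "rows": rows})
    return f"{base_url}/{collection}/select?{params}"


# Placeholder host and collection, assumed for illustration only.
url = select_url("http://localhost:8983/solr", "mycollection", "title:solr", wt="python")
print(url)
# http://localhost:8983/solr/mycollection/select?q=title%3Asolr&wt=python&rows=10
```

Any HTTP client in any language can fetch such a URL, which is why a
dedicated client library is often optional.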

Have a look:
http://wiki.apache.org/solr/IntegratingSolr
https://lucene.apache.org/solr/guide/6_6/client-apis.html

Regards.

On Wed, Aug 30, 2017 at 1:59 PM, Alexandre Rafalovitch 
wrote:

> We do have a page on the Wiki with a lot of that information.
>
> Did you see it?
>
> Regards,
> Alex
>
>
> On 29 Aug. 2017 2:28 am, "Aditya"  wrote:
>
> Hi
>
> I am aggregating open source Solr client libraries across all languages.
> Below are the links. Very few projects are currently active. Most of them
> were last updated a few years back. Please give me pointers if I missed
> any Solr client library.
>
> http://www.findbestopensource.com/tagged/solr-client
> http://www.findbestopensource.com/tagged/solr-gui
>
>
> Regards
> Ganesh
>
> PS: The website http://www.findbestopensource.com search is powered by
> Solr.
>


Re: Bug in Solr 6.6.0? "Cannot change DocValues type from SORTED_SET to SORTED"

2017-08-30 Thread Erick Erickson
This usually means you changed multiValued from true to false or vice
versa then added more docs.

So since each segment is its own "mini index", different segments have
different expectations and when you query this error is thrown.

Most of the time when you change a field's type in the schema you have
to re-index from scratch. And I'd delete *:* first (or just use a new
collection and alias).

Best,
Erick
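
The "mini index" explanation above can be pictured with a toy model. This
is not Lucene code: segments are plain dictionaries and the function name is
invented for the sketch. It assumes, as is usual for string-like fields,
that a multiValued field is written as SORTED_SET and a single-valued one as
SORTED. Merging the per-segment metadata fails as soon as two segments
disagree about the same field.

```python
def merged_field_types(segments):
    """Merge per-segment field metadata; fail when a field's DocValues type differs."""
    merged = {}
    for segment in segments:
        for field, dv_type in segment.items():
            previous = merged.setdefault(field, dv_type)
            if previous != dv_type:
                raise ValueError(
                    f"cannot change DocValues type from {previous} "
                    f'to {dv_type} for field "{field}"'
                )
    return merged


# Assumed scenario: the schema changed between the two segments being written.
old_segment = {"index_todelete": "SORTED_SET"}  # written while the field was multiValued
new_segment = {"index_todelete": "SORTED"}      # written after it became single-valued

try:
    merged_field_types([old_segment, new_segment])
except ValueError as err:
    print(err)
```

Re-indexing into an empty or new collection avoids the clash because every
segment is then written with the same field definition.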

On Wed, Aug 30, 2017 at 10:04 AM, Stephan Schubert
 wrote:
> After I tried an update from Solr 6.5.0 to Solr 6.6.0 (SolrCloud mode), I
> get the following error in one collection:
>
> "Cannot change DocValues type from SORTED_SET to SORTED" for field
> "index_todelete".
>
> I had a look at the index values (where set, all are true; otherwise not
> filled; checked via faceting in the working instance) and I can't see any
> special issues with this field. If I move back to Solr 6.5.0, the
> Solr collection comes up normally with the same set of index data. So I
> assume something changed in 6.6.0, but I couldn't find anything in the
> release notes or in the known issues in JIRA.
>
> Does anyone have an idea what's going on here? The field doesn't even have
> docValues or multiValued set, so I don't understand the error message
> here.
>
> Configuration in schema.xml:
>  stored="true" type="boolean"/>
>
>
> Error Log:
> java.util.concurrent.ExecutionException:
> org.apache.solr.common.SolrException: Unable to create core
> [GLOBAL-Fileshares-Index_shard1_replica2]
>  at
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at
> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at
> org.apache.solr.core.CoreContainer.lambda$load$6(CoreContainer.java:586)
>  at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>  at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>  at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.solr.common.SolrException: Unable to create core
> [GLOBAL-Fileshares-Index_shard1_replica2]
>  at
> org.apache.solr.core.CoreContainer.create(CoreContainer.java:935)
>  at
> org.apache.solr.core.CoreContainer.lambda$load$5(CoreContainer.java:558)
>  at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
>  ... 5 more
> Caused by: org.apache.solr.common.SolrException: Error opening new
> searcher
>  at
> org.apache.solr.core.SolrCore.<init>(SolrCore.java:977)
>  at
> org.apache.solr.core.SolrCore.<init>(SolrCore.java:830)
>  at
> org.apache.solr.core.CoreContainer.create(CoreContainer.java:920)
>  ... 7 more
> Caused by: org.apache.solr.common.SolrException: Error opening new
> searcher
>  at
> org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2069)
>  at
> org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2189)
>  at
> org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1071)
>  at
> org.apache.solr.core.SolrCore.<init>(SolrCore.java:949)
>  ... 9 more
> Caused by: java.lang.IllegalArgumentException: cannot change DocValues
> type from SORTED_SET to SORTED for field "index_todelete"
>  at
> org.apache.lucene.index.FieldInfo.setDocValuesType(FieldInfo.java:212)
>  at
> org.apache.lucene.index.FieldInfos$Builder.addOrUpdateInternal(FieldInfos.java:430)
>  at
> org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:438)
>  at
> org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:375)
>  at
> org.apache.lucene.index.MultiFields.getMergedFieldInfos(MultiFields.java:245)
>  at
> org.apache.solr.index.SlowCompositeReaderWrapper.getFieldInfos(SlowCompositeReaderWrapper.java:266)
>  at
> org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:281)
>  at
> org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2037)
>  ... 12 more


Re: Solr client

2017-08-30 Thread Alexandre Rafalovitch
We do have a page on the Wiki with a lot of that information.

Did you see it?

Regards,
Alex


On 29 Aug. 2017 2:28 am, "Aditya"  wrote:

Hi

I am aggregating open source Solr client libraries across all languages.
Below are the links. Very few projects are currently active. Most of them
were last updated a few years back. Please give me pointers if I missed
any Solr client library.

http://www.findbestopensource.com/tagged/solr-client
http://www.findbestopensource.com/tagged/solr-gui


Regards
Ganesh

PS: The website http://www.findbestopensource.com search is powered by Solr.


Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the same nodes possible?

2017-08-30 Thread Susheel Kumar
1) As regards naming of the shards: Is using the same naming for the shards
o.k. in this constellation? I.e. does it create trouble to have e.g.
"Shard001", "Shard002", etc. in collection1 and "Shard001", "Shard002",
etc. as well in collection2?
>> The default naming convention for shards would be
"<collection>_shard#_replica#". So the complete names will be different,
like coll1_shard1_replica1 and coll2_shard1_replica1.

2) Performance: In my current single collection setup, I have 2 shards per
node. After creating the second collection, there will be 4 shards per
node. Do I have to edit the RAM per node value (raise the -m parameter when
starting the node)? In my case, I am quite sure that the collections will
never be queried simultaneously. So will the "running but idle" collection
slow me down?
>> It's up to you how you set up the JVMs. You can run one JVM instance on a
single port, say 8080, hosting multiple shards/collections, or you can set up
two JVM/Solr instances on a node running on different ports, such as 8080 and
8081. I would suggest starting and testing with one JVM hosting multiple
collections until you run into a performance bottleneck, and only then
splitting into separate JVMs with different heaps.



On Wed, Aug 30, 2017 at 12:42 PM, Johannes Knaus  wrote:

> Thank you, Susheel, for the quick response.
>
> So, that means that when I create a new collection, its shards will be
> newly created at each node, right?
> Thus, if I have two collections with
> numShards=38,
> maxShardsPerNode=2 and
> replicationFactor=2
> on my 38 nodes, then this would result in each node "hosting" 4 shards
> (two from each collection).
>
> If this is correct, I have two follow up questions:
>
> 1) As regards naming of the shards: Is using the same naming for the
> shards o.k. in this constellation? I.e. does it create trouble to have e.g.
> "Shard001", "Shard002", etc. in collection1 and "Shard001", "Shard002",
> etc. as well in collection2?
>
> 2) Performance: In my current single collection setup, I have 2 shards per
> node. After creating the second collection, there will be 4 shards per
> node. Do I have to edit the RAM per node value (raise the -m parameter when
> starting the node)? In my case, I am quite sure that the collections will
> never be queried simultaneously. So will the "running but idle" collection
> slow me down?
>
> Johannes
>
> -Original Message-
> From: Susheel Kumar [mailto:susheel2...@gmail.com]
> Sent: Wednesday, 30 August 2017 17:36
> To: solr-user@lucene.apache.org
> Subject: Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the
> same nodes possible?
>
> Yes, absolutely.  You can create as many collections as you need (just as
> you would create tables in the relational world).
>
> On Wed, Aug 30, 2017 at 10:13 AM, Johannes Knaus 
> wrote:
>
> > I have a working SolrCloud-Setup with 38 nodes with a collection
> > spanning over these nodes with 2 shards per node and replication
> > factor 2 and a router field.
> >
> > Now I got some new data for indexing which has the same structure and
> > size as my existing index in the described collection.
> > However, although it has the same structure the new data to be indexed
> > should not be mixed with the old data.
> >
> > Do I have create another 38 new nodes and a new collection and index
> > the new data or is there a better / more efficient way I could use the
> > existing nodes?
> > Is it possible that the 2 collections could share the 38 nodes without
> > the indexes being mixed?
> >
> > Thanks for your help.
> >
> > Johannes
> >
>


Bug in Solr 6.6.0? "Cannot change DocValues type from SORTED_SET to SORTED"

2017-08-30 Thread Stephan Schubert
After I tried an update from Solr 6.5.0 to Solr 6.6.0 (SolrCloud mode), one 
collection fails with the following error:

"Cannot change DocValues type from SORTED_SET to SORTED for field 
"index_todelete""

I had a look at the indexed values (where set, all are true; otherwise the 
field is empty; checked via faceting in the working instance) and I can't 
see anything unusual about this field. If I move back to Solr 6.5.0, the 
collection comes up normally with the same index data. So I assume 
something changed in 6.6.0, but I couldn't find anything in the release 
notes or among the known issues in JIRA.

Does anyone have an idea what's going on here? The field doesn't even have 
docValues set, nor is it multivalued, so I don't understand the error 
message.

Configuration in schema.xml:



Error Log:
java.util.concurrent.ExecutionException: 
org.apache.solr.common.SolrException: Unable to create core 
[GLOBAL-Fileshares-Index_shard1_replica2]
 at 
java.util.concurrent.FutureTask.report(FutureTask.java:122)
 at 
java.util.concurrent.FutureTask.get(FutureTask.java:192)
 at 
org.apache.solr.core.CoreContainer.lambda$load$6(CoreContainer.java:586)
 at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
 at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Unable to create core 
[GLOBAL-Fileshares-Index_shard1_replica2]
 at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:935)
 at 
org.apache.solr.core.CoreContainer.lambda$load$5(CoreContainer.java:558)
 at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
 ... 5 more
Caused by: org.apache.solr.common.SolrException: Error opening new 
searcher
 at 
org.apache.solr.core.SolrCore.<init>(SolrCore.java:977)
 at 
org.apache.solr.core.SolrCore.<init>(SolrCore.java:830)
 at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:920)
 ... 7 more
Caused by: org.apache.solr.common.SolrException: Error opening new 
searcher
 at 
org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2069)
 at 
org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2189)
 at 
org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1071)
 at 
org.apache.solr.core.SolrCore.<init>(SolrCore.java:949)
 ... 9 more
Caused by: java.lang.IllegalArgumentException: cannot change DocValues 
type from SORTED_SET to SORTED for field "index_todelete"
 at 
org.apache.lucene.index.FieldInfo.setDocValuesType(FieldInfo.java:212)
 at 
org.apache.lucene.index.FieldInfos$Builder.addOrUpdateInternal(FieldInfos.java:430)
 at 
org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:438)
 at 
org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:375)
 at 
org.apache.lucene.index.MultiFields.getMergedFieldInfos(MultiFields.java:245)
 at 
org.apache.solr.index.SlowCompositeReaderWrapper.getFieldInfos(SlowCompositeReaderWrapper.java:266)
 at 
org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:281)
 at 
org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2037)
 ... 12 more

Re: cwiki has problems ?

2017-08-30 Thread Cassandra Targett
The thing everyone should be aware of is that those strings you see
aren't just strange styles, they are actually lost code blocks - many
of the code examples throughout the old Ref Guide are now missing
(some, for some reason, aren't affected). IOW, if you use the old Ref
Guide, you will be missing critical information.

As Erick mentioned we're working on an automatic redirect from old to
new, but it's also in the Infra group's queue since they manage that
application.

Cassandra

On Wed, Aug 30, 2017 at 10:58 AM, Erick Erickson
 wrote:
> This has happened to several projects, so it's something
> infrastructure related not specific to Solr's CWiki. We've raised a
> ticket for infra to see if they can find the root cause.
>
> Cassandra and Hoss are trying to address the whole
> CWiki-no-longer-current issue.
>
> BTW, I find it useful to download the PDF (upper left corner) for
> whatever version you want and search that locally. I only have 16
> separate ones on my machine ;)
>
> Best,
> Erick
>
> On Wed, Aug 30, 2017 at 8:18 AM, Susheel Kumar  wrote:
>> Now the documentation is being updated at
>>
>> http://lucene.apache.org/solr/guide/6_6/index.html
>>
>> On Wed, Aug 30, 2017 at 10:03 AM, Bernd Fehling <
>> bernd.fehl...@uni-bielefeld.de> wrote:
>>
>>> Can someone fix https://cwiki.apache.org/confluence/ ?
>>>
>>> Seems to have problems with styles?
>>>
>>> Tons of #66solid and #66nonesolid in the text.
>>> E.g. :
>>> https://cwiki.apache.org/confluence/display/solr/
>>> Getting+Started+with+SolrCloud
>>>
>>> Thanks, Bernd
>>>
>>>


Re: Index relational database

2017-08-30 Thread Walter Underwood
Think about making a denormalized view, with all the fields needed in one 
table. That view gets sent to Solr. Each row is a Solr document.

It could be implemented as a view or as SQL, but that is a useful mental model 
for people starting from a relational background.
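
A minimal sketch of that mental model, assuming two hypothetical tables, `authors` and `books`, held in an in-memory SQLite database. The table names, columns, and document field names are invented for illustration. Each joined row becomes one flat dictionary that could then be sent to Solr as a document.

```python
import sqlite3


def demo_connection():
    """Create a tiny in-memory database with two related (hypothetical) tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
        INSERT INTO authors VALUES (1, 'Ann', 'Oslo');
        INSERT INTO books VALUES (10, 'First', 1);
        INSERT INTO books VALUES (11, 'Second', 1);
    """)
    return conn


def denormalized_docs(conn):
    """Join books with authors so that each row carries every field needed."""
    rows = conn.execute(
        "SELECT b.id, b.title, a.name, a.city "
        "FROM books b JOIN authors a ON a.id = b.author_id "
        "ORDER BY b.id"
    )
    # One row of the denormalized view corresponds to one Solr document.
    return [
        {"id": str(book_id), "title": title,
         "author_name": name, "author_city": city}
        for book_id, title, name, city in rows
    ]
```

Calling `denormalized_docs(demo_connection())` yields two flat documents, each repeating the author fields. The duplication is the price of not having to join at query time.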

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Aug 30, 2017, at 9:14 AM, Erick Erickson  wrote:
> 
> First, it's often best, by far, to denormalize the data in your solr index,
> that's what I'd explore first.
> 
> If you can't do that, the join query parser might work for you.
> 
> On Aug 30, 2017 4:49 AM, "Renuka Srishti" 
> wrote:
> 
>> Thanks Susheel for your response.
>> Here is the scenario about which I am talking:
>> 
>>   - Let suppose there are two documents doc1 and doc2.
>>   - I want to fetch the data from doc2 on the basis of doc1 fields which
>>   are related to doc2.
>> 
>> How to achieve this efficiently.
>> 
>> 
>> Thanks,
>> 
>> Renuka Srishti
>> 
>> 
>> On Mon, Aug 28, 2017 at 7:02 PM, Susheel Kumar 
>> wrote:
>> 
>>> Hello Renuka,
>>> 
>>> I would suggest to start with your use case(s). May be start with your
>>> first use case with the below questions
>>> 
>>> a) What is that you want to search (which fields like name, desc, city
>>> etc.)
>>> b) What is that you want to show part of search result (name, city etc.)
>>> 
>>> Based on above two questions, you would know what data to pull in from
>>> relational database and create solr schema and index the data.
>>> 
>>> You may first try to denormalize / flatten the structure so that you deal
>>> with one collection/schema and query upon it.
>>> 
>>> HTH.
>>> 
>>> Thanks,
>>> Susheel
>>> 
>>> On Mon, Aug 28, 2017 at 8:04 AM, Renuka Srishti <
>>> renuka.srisht...@gmail.com>
>>> wrote:
>>> 
>>>> Hii,
>>>> 
>>>> What is the best way to index relational database, and how it impacts on
>>>> the performance?
>>>> 
>>>> Thanks
>>>> Renuka Srishti
>>>> 
>>> 
>> 



Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the same nodes possible?

2017-08-30 Thread Johannes Knaus
Thank you, Susheel, for the quick response.

So, that means that when I create a new collection, its shards will be newly 
created at each node, right?
Thus, if I have two collections with 
numShards=38, 
maxShardsPerNode=2 and 
replicationFactor=2 
on my 38 nodes, then this would result in each node "hosting" 4 shards (two 
from each collection).

If this is correct, I have two follow up questions:

1) As regards naming of the shards: Is using the same naming for the shards 
o.k. in this constellation? I.e. does it create trouble to have e.g. 
"Shard001", "Shard002", etc. in collection1 and "Shard001", "Shard002", etc. as 
well in collection2?

2) Performance: In my current single collection setup, I have 2 shards per 
node. After creating the second collection, there will be 4 shards per node. Do 
I have to edit the RAM per node value (raise the -m parameter when starting the 
node)? In my case, I am quite sure that the collections will never be queried 
simultaneously. So will the "running but idle" collection slow me down?

Johannes

-Original Message-
From: Susheel Kumar [mailto:susheel2...@gmail.com] 
Sent: Wednesday, 30 August 2017 17:36
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the same 
nodes possible?

Yes, absolutely.  You can create as many collections as you need (just as you 
would create tables in the relational world).

On Wed, Aug 30, 2017 at 10:13 AM, Johannes Knaus  wrote:

> I have a working SolrCloud-Setup with 38 nodes with a collection 
> spanning over these nodes with 2 shards per node and replication 
> factor 2 and a router field.
>
> Now I got some new data for indexing which has the same structure and 
> size as my existing index in the described collection.
> However, although it has the same structure the new data to be indexed 
> should not be mixed with the old data.
>
> Do I have create another 38 new nodes and a new collection and index 
> the new data or is there a better / more efficient way I could use the 
> existing nodes?
> Is it possible that the 2 collections could share the 38 nodes without 
> the indexes being mixed?
>
> Thanks for your help.
>
> Johannes
>


Overseer task timeout

2017-08-30 Thread Mikhail Ibraheem
Hi,
We have a single-node ZooKeeper and a single-node Solr. Sometimes, when trying 
to create or delete a collection, we get the error "SEVERE: 
null:org.apache.solr.common.SolrException: delete the collection time out:180s".
After checking the code, I found that Solr puts a task node into ZooKeeper 
(/overseer/collection-queue-work/qnr-012764 and 
/overseer/collection-queue-work/qn-012764). A watcher listens for this node 
and processes the task, then deletes the response node, which triggers the 
latchWatcher to notify the waiting thread that the task has finished. The 
timeout for this is 180 seconds (hard-coded). I suspect that the watcher that 
should trigger the processor is sometimes not fired. Is that a bug? How can it 
be fixed?
Please help.
Thanks
Mikhail

Re: Index relational database

2017-08-30 Thread Erick Erickson
First, it's often best, by far, to denormalize the data in your solr index,
that's what I'd explore first.

If you can't do that, the join query parser might work for you.
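
For reference, a join query uses local-params syntax of the form `{!join from=... to=...}`. Below is a minimal sketch of building such a request. The collection URL and the field names (`parent_id_s`, `id`, `name_s`) are hypothetical and not taken from this thread.

```python
from urllib.parse import urlencode


def join_query_url(base_url, from_field, to_field, inner_query):
    """Build a select URL whose q parameter uses Solr's join query parser.

    Solr first finds the documents matching inner_query, collects their
    values of from_field, and returns the documents whose to_field holds
    one of those values.
    """
    q = "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)
    return base_url + "/select?" + urlencode({"q": q, "wt": "json"})
```

For example, `join_query_url("http://localhost:8983/solr/mycollection", "parent_id_s", "id", "name_s:widget")` asks for the documents whose `id` is referenced by the `parent_id_s` field of documents named "widget".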

On Aug 30, 2017 4:49 AM, "Renuka Srishti" 
wrote:

> Thanks Susheel for your response.
> Here is the scenario about which I am talking:
>
>- Let suppose there are two documents doc1 and doc2.
>- I want to fetch the data from doc2 on the basis of doc1 fields which
>are related to doc2.
>
> How to achieve this efficiently.
>
>
> Thanks,
>
> Renuka Srishti
>
>
> On Mon, Aug 28, 2017 at 7:02 PM, Susheel Kumar 
> wrote:
>
> > Hello Renuka,
> >
> > I would suggest to start with your use case(s). May be start with your
> > first use case with the below questions
> >
> > a) What is that you want to search (which fields like name, desc, city
> > etc.)
> > b) What is that you want to show part of search result (name, city etc.)
> >
> > Based on above two questions, you would know what data to pull in from
> > relational database and create solr schema and index the data.
> >
> > You may first try to denormalize / flatten the structure so that you deal
> > with one collection/schema and query upon it.
> >
> > HTH.
> >
> > Thanks,
> > Susheel
> >
> > On Mon, Aug 28, 2017 at 8:04 AM, Renuka Srishti <
> > renuka.srisht...@gmail.com>
> > wrote:
> >
> > > Hii,
> > >
> > > What is the best way to index relational database, and how it impacts on
> > > the performance?
> > >
> > > Thanks
> > > Renuka Srishti
> > >
> >
>


Re: cwiki has problems ?

2017-08-30 Thread Erick Erickson
This has happened to several projects, so it's something
infrastructure-related, not specific to Solr's CWiki. We've raised a
ticket for infra to see if they can find the root cause.

Cassandra and Hoss are trying to address the whole
CWiki-no-longer-current issue.

BTW, I find it useful to download the PDF (upper left corner) for
whatever version you want and search that locally. I only have 16
separate ones on my machine ;)

Best,
Erick

On Wed, Aug 30, 2017 at 8:18 AM, Susheel Kumar  wrote:
> Now the documentation is being updated at
>
> http://lucene.apache.org/solr/guide/6_6/index.html
>
> On Wed, Aug 30, 2017 at 10:03 AM, Bernd Fehling <
> bernd.fehl...@uni-bielefeld.de> wrote:
>
>> Can someone fix https://cwiki.apache.org/confluence/ ?
>>
>> Seems to have problems with styles?
>>
>> Tons of #66solid and #66nonesolid in the text.
>> E.g. :
>> https://cwiki.apache.org/confluence/display/solr/
>> Getting+Started+with+SolrCloud
>>
>> Thanks, Bernd
>>
>>


Re: SolrCloud indexing -- 2 collections, 2 indexes, sharing the same nodes possible?

2017-08-30 Thread Susheel Kumar
Yes, absolutely.  You can create as many collections as you need (just as you
would create tables in the relational world).
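
As a rough illustration, a second collection on the same nodes could be created through the Collections API CREATE action. The sketch below only builds the request URL. The Solr URL, collection name, and config set name are hypothetical, and the shard numbers are just the ones mentioned in this thread.

```python
from urllib.parse import urlencode


def create_collection_url(solr_url, name, num_shards, replication_factor,
                          max_shards_per_node, config_name):
    """Build a Collections API CREATE request URL for a new collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
        "collection.configName": config_name,
    }
    return solr_url + "/admin/collections?" + urlencode(params)


def cores_per_node(num_shards, replication_factor, num_nodes):
    """Average number of cores that one collection places on each node."""
    return num_shards * replication_factor / num_nodes
```

With 38 shards, replication factor 2, and 38 nodes, `cores_per_node(38, 2, 38)` gives 2 cores per node for each collection, so two such collections put 4 cores on every node.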

On Wed, Aug 30, 2017 at 10:13 AM, Johannes Knaus  wrote:

> I have a working SolrCloud-Setup with 38 nodes with a collection spanning
> over these nodes with 2 shards per node and replication factor 2 and a
> router field.
>
> Now I got some new data for indexing which has the same structure and size
> as my existing index in the described collection.
> However, although it has the same structure the new data to be indexed
> should not be mixed with the old data.
>
> Do I have create another 38 new nodes and a new collection and index the
> new data or is there a better / more efficient way I could use the existing
> nodes?
> Is it possible that the 2 collections could share the 38 nodes without the
> indexes being mixed?
>
> Thanks for your help.
>
> Johannes
>


Re: cwiki has problems ?

2017-08-30 Thread Susheel Kumar
Now the documentation is being updated at

http://lucene.apache.org/solr/guide/6_6/index.html

On Wed, Aug 30, 2017 at 10:03 AM, Bernd Fehling <
bernd.fehl...@uni-bielefeld.de> wrote:

> Can someone fix https://cwiki.apache.org/confluence/ ?
>
> Seems to have problems with styles?
>
> Tons of #66solid and #66nonesolid in the text.
> E.g. :
> https://cwiki.apache.org/confluence/display/solr/
> Getting+Started+with+SolrCloud
>
> Thanks, Bernd
>
>


Re: install_solr_service.sh issues with SUSE SLES 12.1, 12.2

2017-08-30 Thread Adrian H

Thanks, sorry I missed the issue in JIRA.

I think the second issue still stands and is unrelated - it's only 
related in the sense that it affects the same file.



2) only on SLES 12.2 (and Leap 42.2) and possibly future versions:

related to the following change:
https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12-SP2/#fate-320973

On a fresh install, the command "service solr start" will result in the
message: solr is neither service nor target!?

adding "systemctl daemon-reload" before starting the service fixes this
issue, however, I'm not sure if this is the right approach.





On 08/30/2017 03:44 PM, Susheel Kumar wrote:

I had this opened https://issues.apache.org/jira/browse/SOLR-10932 earlier
and discussion link

http://lucene.472066.n3.nabble.com/install-solr-service-possible-bug-td4340502.html


We shall put a fix for this as Shawn suggested.



On Wed, Aug 30, 2017 at 9:02 AM, Adrian H  wrote:






SolrCloud indexing -- 2 collections, 2 indexes, sharing the same nodes possible?

2017-08-30 Thread Johannes Knaus
I have a working SolrCloud-Setup with 38 nodes with a collection spanning over 
these nodes with 2 shards per node and replication factor 2 and a router field.

Now I got some new data for indexing which has the same structure and size as 
my existing index in the described collection.
However, although it has the same structure the new data to be indexed should 
not be mixed with the old data.

Do I have create another 38 new nodes and a new collection and index the new 
data or is there a better / more efficient way I could use the existing nodes?
Is it possible that the 2 collections could share the 38 nodes without the 
indexes being mixed?

Thanks for your help.

Johannes


cwiki has problems ?

2017-08-30 Thread Bernd Fehling
Can someone fix https://cwiki.apache.org/confluence/ ?

Seems to have problems with styles?

Tons of #66solid and #66nonesolid in the text.
E.g. :
https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud

Thanks, Bernd



Re: install_solr_service.sh issues with SUSE SLES 12.1, 12.2

2017-08-30 Thread Susheel Kumar
I had this opened https://issues.apache.org/jira/browse/SOLR-10932 earlier
and discussion link

http://lucene.472066.n3.nabble.com/install-solr-service-possible-bug-td4340502.html


We shall put a fix for this as Shawn suggested.



On Wed, Aug 30, 2017 at 9:02 AM, Adrian H  wrote:

> hi all
>
> I've installed Solr 6.6.0 (and older versions) on a couple of SUSE servers
> and ran into the following issues with the service installer script:
>
> 1) on both SLES 12.1 and 12.2:
> line 196:  service --version &>/dev/null || print_error "Script requires
> the 'service' command"
>
> service --version
> exits with an exit code of 1 so the script stops there.  There is no
> --version option in the SUSE packaged service command, so it's an error.
>
> changing this to:
> service --help
> resolves the issue as it exits with a 0. I can confirm that --help also
> exits with a 0 on my debian 9 system, but I don't know about the others.
>
> 2) only on SLES 12.2 (and Leap 42.2) and possibly future versions:
>
> related to the following change:
> https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12-SP2/#fate-320973
>
> On a fresh install, the command "service solr start" will result in the
> message: solr is neither service nor target!?
>
> adding "systemctl daemon-reload" before starting the service fixes this
> issue, however, I'm not sure if this is the right approach.
>
>
> I'm new to this mailing list and project, so I don't know if these are
> issues which should be created in JIRA or first discussed here.
>
> cheers
> Adrian
>


Re: Solr memory leak

2017-08-30 Thread Hendrik Haddorp
Did you get an answer? Would really be nice to have that in the next 
release.


On 28.08.2017 18:31, Erick Erickson wrote:

Varun Thacker is the RM for Solr 6.6.1, I've pinged him about including it.

On Mon, Aug 28, 2017 at 8:52 AM, Walter Underwood  wrote:

That would be a really good reason for a 6.7.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)



On Aug 28, 2017, at 8:48 AM, Markus Jelsma  wrote:

It is, unfortunately, not committed for 6.7.





-Original message-

From:Markus Jelsma 
Sent: Monday 28th August 2017 17:46
To: solr-user@lucene.apache.org
Subject: RE: Solr memory leak

See https://issues.apache.org/jira/browse/SOLR-10506
Fixed for 7.0

Markus



-Original message-

From:Hendrik Haddorp 
Sent: Monday 28th August 2017 17:42
To: solr-user@lucene.apache.org
Subject: Solr memory leak

Hi,

we noticed that triggering collection reloads on many collections has a
good chance to result in an OOM-Error. To investigate that further I did
a simple test:
 - Start solr with a 2GB heap and 1GB Metaspace
 - create a trivial collection with a few documents (I used only 2
fields and 100 documents)
 - trigger a collection reload in a loop (I used SolrJ for this)

Using Solr 6.3 the test started to fail after about 250 loops. Solr 6.6
worked better but also failed after 1100 loops.

When looking at the memory usage on the Solr dashboard it looks like the
space left after GC cycles gets less and less. Then Solr gets very slow,
as the JVM is busy with the GC. A bit later Solr gets an OOM-Error. In
my last run this was actually for the Metaspace. So it looks like more
and more heap and metaspace is being used by just constantly reloading a
trivial collection.

regards,
Hendrik
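
The reload loop described above could look roughly like this. The original test used SolrJ. This is a Python sketch against the Collections API RELOAD action instead, and the Solr URL and collection name in the usage note are hypothetical. The `fetch` parameter is an illustrative hook so the loop can be exercised without a running Solr.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def reload_url(solr_url, collection):
    """Build the Collections API URL that reloads one collection."""
    return solr_url + "/admin/collections?" + urlencode(
        {"action": "RELOAD", "name": collection})


def reload_in_loop(solr_url, collection, iterations, fetch=urlopen):
    """Trigger RELOAD repeatedly and return the number of completed reloads.

    An exception from fetch (for example once Solr stops responding)
    ends the loop early by propagating to the caller.
    """
    done = 0
    for _ in range(iterations):
        with fetch(reload_url(solr_url, collection)) as response:
            response.read()  # wait for Solr to finish the reload
        done += 1
    return done
```

A run such as `reload_in_loop("http://localhost:8983/solr", "test", 2000)` while watching heap and Metaspace usage on the dashboard would approximate the experiment.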





Different ideas for querying unique and non-unique records

2017-08-30 Thread Susheel Kumar
Hello,

I am looking for different ideas/suggestions to solve the use case I am
working on.

We have a couple of fields in the schema along with id: business_email and
personal_email.  We need to return all records based on unique business and
personal emails.

A record is unique if neither its business nor its personal email is
repeated in any other record.
A record is non-unique if any of its business or personal emails
occurs in other records; all such records are non-unique.
E.g. considering the documents below:
- for unique records, only id=1 should be returned (since john.doe is
not present in any other record's personal or business email)
- for non-unique records, id=2,3 should be returned (since isabel.dora is
present in multiple records; it doesn't matter whether it is present in the
business or personal email)

Documents
===
{id:1,business_email_s:john@abc.com,personal_email_s:john@abc.com}
{id:2,business_email_s:isabel.d...@abc.com}
{id:3,personal_email_s:isabel.d...@abc.com}

I am able to solve this using a Streaming Expression query, but I am not sure
whether performance will become a bottleneck, as the streaming expression is
quite big. So I am looking for different ideas, such as de-duplication or
handling this during ingestion/pre-processing, without impacting performance
much.
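
To make the criteria concrete, here is a minimal sketch of the pre-processing idea in plain Python, run over the records before they are indexed. It is not a Solr feature. The sample addresses in the usage note are placeholders, and lower-casing addresses before comparing them is an assumption.

```python
from collections import Counter

EMAIL_FIELDS = ("business_email_s", "personal_email_s")


def emails_of(doc):
    """Return the set of distinct, lower-cased addresses on one record."""
    return {doc[f].lower() for f in EMAIL_FIELDS if doc.get(f)}


def split_by_email_uniqueness(docs):
    """Return (unique_ids, non_unique_ids) according to the criteria above."""
    counts = Counter()
    for doc in docs:
        # An address counts once per record, even if it fills both fields.
        counts.update(emails_of(doc))
    unique, non_unique = [], []
    for doc in docs:
        repeated = any(counts[e] > 1 for e in emails_of(doc))
        (non_unique if repeated else unique).append(doc["id"])
    return unique, non_unique
```

For three records shaped like the example (id 1 with the same address in both fields, ids 2 and 3 sharing one address across different fields), this returns `([1], [2, 3])`. The result could be stored as a flag field at ingestion time so that the query itself stays a simple filter.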

Thanks,
Susheel


install_solr_service.sh issues with SUSE SLES 12.1, 12.2

2017-08-30 Thread Adrian H

hi all

I've installed Solr 6.6.0 (and older versions) on a couple of SUSE 
servers and ran into the following issues with the service installer script:


1) on both SLES 12.1 and 12.2:
line 196:  service --version &>/dev/null || print_error "Script requires 
the 'service' command"


service --version
exits with an exit code of 1 so the script stops there.  There is no 
--version option in the SUSE packaged service command, so it's an error.


changing this to:
service --help
resolves the issue as it exits with a 0. I can confirm that --help also 
exits with a 0 on my debian 9 system, but I don't know about the others.


2) only on SLES 12.2 (and Leap 42.2) and possibly future versions:

related to the following change:
https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12-SP2/#fate-320973

On a fresh install, the command "service solr start" will result in the 
message: solr is neither service nor target!?


adding "systemctl daemon-reload" before starting the service fixes this 
issue, however, I'm not sure if this is the right approach.



I'm new to this mailing list and project, so I don't know if these are 
issues which should be created in JIRA or first discussed here.


cheers
Adrian


Re: Solr index getting replaced instead of merged

2017-08-30 Thread Gurdeep Singh
I'm not sure how you are indexing. Try adding clean=false to your indexing 
command/script when you index the second table.
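
A sketch of what that might look like, assuming the indexing goes through the DataImportHandler (the thread does not say so) and that the handler is registered at the conventional `/dataimport` path. The core URL and entity name are hypothetical.

```python
from urllib.parse import urlencode


def full_import_url(core_url, entity=None, clean=False):
    """Build a DataImportHandler full-import URL.

    With clean=false the existing index is kept, so documents from a
    second table are added instead of replacing the first import.
    """
    params = {"command": "full-import", "clean": "true" if clean else "false"}
    if entity:
        params["entity"] = entity  # import only this entity (hypothetical name)
    return core_url + "/dataimport?" + urlencode(params)
```

For example, `full_import_url("http://localhost:8983/solr/mycore", entity="table2")` produces a request that imports only `table2` without wiping what is already indexed. Documents from different tables still need distinct unique keys, otherwise later rows overwrite earlier ones.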





> On 30 Aug 2017, at 7:06 PM, Agrawal, Harshal (GE Digital) 
>  wrote:
> 
> Hello Guys,
> 
> I have installed solr in my local system and was able to connect to Teradata 
> successfully.
> For single table I am able to index the data and query it also but when I am 
> trying for multiple tables in the same schema and doing indexing one by one 
> respectively.
> I can see datasets getting replaced instead of merged .
> 
> Can anyone help me please:
> 
> Regards
> Harshal
> 
> 


Solr index getting replaced instead of merged

2017-08-30 Thread Agrawal, Harshal (GE Digital)
Hello Guys,

I have installed Solr on my local system and was able to connect to Teradata 
successfully.
For a single table I am able to index the data and query it. But when I index 
multiple tables in the same schema, one after another, the datasets get 
replaced instead of merged.

Can anyone help me, please?

Regards
Harshal




Re: Index relational database

2017-08-30 Thread Renuka Srishti
Thanks Susheel for your response.
Here is the scenario I am talking about:

   - Let's suppose there are two documents, doc1 and doc2.
   - I want to fetch the data from doc2 on the basis of doc1 fields which
   are related to doc2.

How can I achieve this efficiently?


Thanks,

Renuka Srishti


On Mon, Aug 28, 2017 at 7:02 PM, Susheel Kumar 
wrote:

> Hello Renuka,
>
> I would suggest to start with your use case(s). May be start with your
> first use case with the below questions
>
> a) What is that you want to search (which fields like name, desc, city
> etc.)
> b) What is that you want to show part of search result (name, city etc.)
>
> Based on above two questions, you would know what data to pull in from
> relational database and create solr schema and index the data.
>
> You may first try to denormalize / flatten the structure so that you deal
> with one collection/schema and query upon it.
>
> HTH.
>
> Thanks,
> Susheel
>
> On Mon, Aug 28, 2017 at 8:04 AM, Renuka Srishti <
> renuka.srisht...@gmail.com>
> wrote:
>
> > Hii,
> >
> > What is the best way to index relational database, and how it impacts on
> > the performance?
> >
> > Thanks
> > Renuka Srishti
> >
>


Re: Recommended Python Library for Complex Querying?

2017-08-30 Thread Leonardo Perez Pulido
Hi,
Maybe this can help:
http://lucene.apache.org/pylucene/
Regards.
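
If hand-crafting the query syntax over plain HTTP turns out to be acceptable, the main thing to get right is escaping. The following is a minimal sketch, not a full client. The helper names are made up for illustration, and the escaped character set follows the special characters of the Lucene query syntax.

```python
import re
from urllib.parse import urlencode

# Characters with special meaning in the Lucene query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_term(text):
    """Backslash-escape query-syntax characters in a single term."""
    return _SPECIAL.sub(r"\\\1", text)


def field_query(field, value):
    """Build a field:value clause with the value escaped."""
    return "%s:%s" % (field, escape_term(value))


def select_url(collection_url, query, rows=10):
    """Build a select request URL for an already assembled query string."""
    return collection_url + "/select?" + urlencode(
        {"q": query, "rows": rows, "wt": "json"})
```

For instance, `field_query("title", "C++")` gives `title:C\+\+`. Values containing whitespace would additionally need quoting as a phrase, which this sketch leaves out.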


On Wed, Aug 30, 2017 at 2:44 AM, ron visbord  wrote:

> Hi all,
>
> I'm rebuilding the Solr part of my search engine in Python.
>
> I work with both Solr 5[.2.1] and 6[.3.0].
>
> I looked around but found no satisfactory Python library to help me write
> queries. I come from SolrJ, so I'm used to having classes for all types of
> queries.
>
> What would you say is the most "powerful" and up-to-date Python library for
> Solr?
>
> Or will I be forced to hand-craft the lucene syntax?
>
> Thanks in advance,
> Ron
>


geodist function

2017-08-30 Thread Maruska Melucci
Hi
I'm using Solr for geospatial queries.
I need to obtain the nearest linestring to my point. I'm using the query
below, but I need the results ordered by distance.

http://localhost:8983/solr/address_search/select?d=0.1&fq={!geofilt%20sfield=geometry}&indent=on&pt=43.916938,12.912577&q=*:*&rows=10&sfield=geometry&start=0&wt=json

I'm trying to use the geodist function as the sort parameter, but when I add
"&sort=geodist()%20asc" the application hangs until it crashes with the error:
"java.lang.OutOfMemoryError: Java heap space"

The field geometry is a MULTILINESTRING of type location_rpt
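
For reference, here is a rough sketch of how the request described above
could be assembled in Python. The parameter names (fq, pt, sfield, d, sort)
are the standard Solr spatial and sort parameters and are assumed here; the
function name and the BASE constant are hypothetical. The sketch only shows
how the pieces fit together and does not address the heap error, and it
makes no claim about how well geodist() suits non-point shapes such as a
MULTILINESTRING in an RPT field.

```python
from urllib.parse import urlencode

# Endpoint taken from the message; treated as a constant for this sketch.
BASE = "http://localhost:8983/solr/address_search/select"


def geodist_sort_url(pt, sfield="geometry", d=0.1, rows=10):
    """Build a select URL that filters with geofilt and sorts by geodist()."""
    params = {
        "q": "*:*",
        "fq": f"{{!geofilt sfield={sfield}}}",  # reads pt and d from the request
        "pt": pt,          # "lat,lon" of the reference point
        "sfield": sfield,  # spatial field used by geofilt and geodist()
        "d": d,            # search radius in kilometres
        "sort": "geodist() asc",
        "rows": rows,
        "start": 0,
        "wt": "json",
    }
    return BASE + "?" + urlencode(params)


url = geodist_sort_url("43.916938,12.912577")
print(url)
```

Letting urlencode handle the escaping avoids hand-written %20 sequences and
makes it harder to drop a parameter name by accident.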


I'm using Solr 6.5.

Can someone help me?

Best Regards
Maruska



 From log:
null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:676)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:544)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:298)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space


Recommended Python Library for Complex Querying?

2017-08-30 Thread ron visbord
Hi all,

I'm rebuilding the Solr part of my search engine in Python.

I work with both Solr 5[.2.1] and 6[.3.0].

I looked around but found no satisfactory Python library to help me write
queries. I come from SolrJ, so I'm used to having classes for all types of
queries.

What would you say is the most "powerful" and up-to-date Python library for
Solr?

Or will I be forced to hand-craft the lucene syntax?

Thanks in advance,
Ron
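
If hand-crafting the Lucene syntax does turn out to be necessary, a few
small helpers can keep it manageable. The following is a minimal sketch,
not a replacement for a real client library: it escapes the characters that
are special in the classic Lucene query syntax and combines clauses with
AND / OR. All function names and the title, author and year fields are
hypothetical.

```python
import re

# Characters with special meaning in the classic Lucene query syntax.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_term(text):
    """Backslash-escape Lucene special characters (whitespace is left as-is)."""
    return _SPECIAL.sub(r"\\\1", text)


def term(field, value):
    """A single-term clause such as title:solr."""
    return f"{field}:{escape_term(value)}"


def phrase(field, value):
    """A quoted phrase clause; only backslash and quote need escaping inside."""
    quoted = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{quoted}"'


def all_of(*clauses):
    """Combine clauses so that every one must match."""
    return "(" + " AND ".join(clauses) + ")"


def any_of(*clauses):
    """Combine clauses so that at least one must match."""
    return "(" + " OR ".join(clauses) + ")"


query = all_of(
    term("title", "C++"),
    any_of(phrase("author", "Ron V"), term("year", "2017")),
)
print(query)
```

The resulting string can be passed as the q parameter of a plain HTTP
request, so the same helpers would work against both Solr 5 and Solr 6.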