How to range search on a field which contains multiple decimal points (e.g. 2.5.0.4)

2016-05-04 Thread Santhosh Sheshasayanan

Hi,

I have an issue on my server. As I stated in the subject, I want to run range
search queries on a field (e.g. a field named "version") which contains values
like 2.5.0.1, 2.5.0.4 and 2.5.0.10.

When I do a range search on the "version" field with the criteria [* TO 2.5.0.5],
it gave me all the values (2.5.0.1, 2.5.0.10, 2.5.0.4). But this is the wrong
result, since I was expecting only 2.5.0.1 and 2.5.0.4.
Instead it included 2.5.0.10 in the results. I googled and found that Solr does
lexical sorting, but I want numerical sorting. I declared the field type as
string in schema.xml.

I tried the following solutions, but nothing worked:

* Converted the field type to a numeric type. But it gave me a
"NumberFormatException", because Java does not allow multiple decimal points.

* Left-padded the value with 0s while adding documents to Solr. But
no luck.

Can you please suggest a good solution to this issue?
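(For reference, zero-padding only makes lexical order agree with numeric order if
every dot-separated segment is padded to the same fixed width at index time, and
the range endpoints are normalized the same way at query time. A minimal sketch of
such a normalizer, with the method name and the width of 5 chosen arbitrarily:

    // Pad every segment so lexical order matches numeric order,
    // e.g. "2.5.0.10" -> "00002.00005.00000.00010".
    static String normalizeVersion(String version) {
        StringBuilder out = new StringBuilder();
        for (String segment : version.split("\\.")) {
            if (out.length() > 0) out.append('.');
            out.append(String.format("%05d", Integer.parseInt(segment)));
        }
        return out.toString();
    }

Indexed this way, a range query with a normalized upper bound, [* TO
00002.00005.00000.00005], excludes 2.5.0.10 as expected.)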


Regards,
Santhosh S








Advice to add additional non-related fields to a collection or create a subset of it?

2016-05-04 Thread Derek Poh

Hi

We have a "product" collection and a "supplier" collection.
The "product" collection contains product information and the "supplier"
collection contains the products' supplier information.
We have a subsidiary page that queries the "product" collection for its
search. The displayed results include both product and supplier information.
This page first queries the "product" collection to get the matching
product records.
From this query, a list of the matching products' supplier IDs is
extracted and used in a filter query against the "supplier" collection
to get the necessary supplier information.


The loading of this page is very slow, and it leads to timeouts at times as
well. Besides looking at tweaking the code of the page, we are also
looking at what tuning can be done on the Solr side. Reducing the number
of queries generated by this page is one of the options to try.


The main "product" collection is also used by our site's main search page
and other subsidiary pages, so the query load on it is substantial.

It has about 6.5 million documents and an index size of 38-39 GB.
It is set up as 1 shard with 5 replicas. Each replica is on its own
server, for a total of 5 servers.
Other, smaller collections with a similar 1-shard, 5-replica setup
reside on these servers as well.


I am thinking of either:
1. Indexing the supplier information into the "product" collection.
2. Creating another, similar "product" collection for this page to use.
This collection would have fewer product fields and would include the
required supplier fields. The number of documents in it would be the
same as in the main "product" collection, but the index size would be smaller.


With either of the two options we would no longer need to query the "supplier"
collection. That is one query fewer, which will hopefully improve the
performance of this page.


Which of the two options is advisable?
Any other advice or options?

Derek


RE: Query String Limit

2016-05-04 Thread Prasanna S. Dhakephalkar
Hi

We had increased the maxBooleanClauses to a large number, but it did not
work.

Here is the query:

http://localhost:8983/solr/collection1/select?fq=record_id%3A(604929+504197+
500759+510957+624719+524081+544530+375687+494822+468221+553049+441998+495212
+462613+623866+344379+462078+501936+189274+609976+587180+620273+479690+60601
8+487078+496314+497899+374231+486707+516582+74518+479684+1696152+1090711+396
784+377205+600603+539686+550483+436672+512228+1102968+600604+487699+612271+4
87978+433952+479846+492699+380838+412290+487086+515836+487957+525335+495426+
619724+49726+444558+67422+368749+630542+473638+613887+1679503+509367+1108299
+498818+528683+530270+595087+468595+585998+487888+600612+515884+455568+60643
8+526281+497992+460147+587530+576456+526021+790508+486148+469160+365923+4846
54+510829+488792+610933+254610+632700+522376+594418+514817+439283+1676569+52
4031+431557+521628+609255+627205+1255921+57+477017+519675+548373+350309+
491176+524276+570935+549458+495765+512814+494722+382249+619036+477309+487718
+470604+514622+1240902+570607+613830+519130+479708+630293+496994+623870+5706
72+390434+483496+609115+490875+443859+292168+522383+501802+606498+596773+479
881+486020+488654+490422+512636+495512+489480+626269+614618+498967+476988+47
7608+486568+270095+295480+478367+607120+583892+593474+494373+368030+484522+5
01183+432822+448109+553418+584084+614868+486206+481014+495027+501880+479113+
615208+488161+512278+597663+569409+139097+489490+584000+493619+607479+281080
+518617+518803+487896+719003+584153+484341+505689+278177+539722+548001+62529
6+1676456+507566+619039+501882+530385+474125+293642+612857+568418+640839+519
893+524335+612859+618762+479460+479719+593700+573677+525991+610965+462087+52
1251+501197+443642+1684784+533972+510695+475499+490644+613829+613893+479467+
542478+1102898+499230+436921+458632+602303+488468+1684407+584373+494603+4992
45+548019+600436+606997+59+503156+440428+518759+535013+548023+494273+649
062+528704+469282+582249+511250+496466+497675+505937+489504+600444+614240+19
35577+464232+522398+613809+1206232+607149+607644+498059+506810+487115+550976
+638174+600849+525655+625011+500082+606336+507156+487887+333601+457209+60111
0+494927+1712081+601280+486061+501558+600451+263864+527378+571918+472415+608
130+212386+380460+590400+478850+631886+486782+608013+613824+581767+527023+62
3207+607013+505819+485418+486786+537626+507047+92+527473+495520+553141+5
17837+497295+563266+495506+532725+267057+497321+453249+524341+429654+720001+
539946+490813+479491+479628+479630+1125985+351147+524296+565077+439949+61241
3+495854+479493+1647796+600259+229346+492571+485638+596394+512112+477237+600
459+263780+704068+485934+450060+475944+582280+488031+1094010+1687904+539515+
525820+539516+505985+600461+488991+387733+520928+362967+351847+531586+616101
+479925+494156+511292+515729+601903+282655+491244+610859+486081+325500+43639
7+600708+523445+480737+486083+614767+486278+1267655+484845+495145+562624+493
381+8060+638731+501347+565979+325132+501363+268866+614113+479646+1964487+631
934+25717+461612+376451+513712+527557+459209+610194+1938903+488861+426305+47
7676+1222682+1246647+567986+501908+791653+325802+498354+435156+484862+533068
+339875+395827+475148+331094+528741+540715+623480+416601+516419+600473+62563
2+480570+447412+449778+503316+492365+563298+486361+500907+514521+138405+6123
27+495344+596879+524918+474563+47273+514739+553189+548418+448943+450612+6006
78+484753+485302+271844+474199+487922+473784+431524+535371+513583+514746+612
534+327470+485855+517878+384102+485856+612768+494791+504840+601330+493551+55
8620+540131+479809+394179+487866+559955+578444+576571+485861+488879+573089+4
97552+487898+490369+535756+614155+633027+487473+517912+523364+527419+600487+
486128+278040+598478+487395+600579+585691+498970+488151+608187+445943+631971
+230291+504552+534443+501924+489148+292672+528874+434783+479533+485301+61908
9+629083+479383+600981+534717+645420+604921+618714+522329+597822+507413+5706
05+491732+464741+511564+613929+526049+614817+589065+603307+491990+467339+264
426+487907+492982+589067+487674+487820+492983+486708+504140+1216198+625736+4
92984+530116+615663+503248+1896822+600588+518139+494994+621846+599669+488207
+640923+487580+539856+603968+444717+492991+614824+491735+492992+495149+52117
2+365778+261681+600502+479682+597464+492997+587172+624381+482355+1246338+593
642+492000+494707+620137+493000+20617+585199+587176+587177+1877064+587179+53
3478+606061+647089+612257+558521+612259+612261+612264+612266+612268+612273+6
12274+612275+612276+612278+612279+1414843+883571+206887+147419+617296+547518
+547519+524791+541892+541895+541943+541945+541947+34708+638171+589724+602793
+593074+614611+570608+614612+606821+614613+614614+490421+614615+619479+61461
6+1898943+1898942+1898945+614619+1898944+614620+614621+614622+614624+615204+
614625+615205+615206+615207+529065+293239+615209+623525+526605+610560+610562
+531607+615211+561824+615212+618273+490249+588274+615213+618275+547994+18802
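(For reference, a large ID filter like this can also be written with the terms
query parser, which is not subject to the maxBooleanClauses limit, and a
parameter string this long is usually better sent as an HTTP POST body so the
servlet container's URL length limit is not hit. A sketch, with the ID list
abbreviated:

    fq={!terms f=record_id}604929,504197,500759,510957,624719,...

)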

Re: Migrating from Solr 5.4 to Solr 6.0

2016-05-04 Thread Zheng Lin Edwin Yeo
Thank you.

That would save us quite a lot of time, as we were worried that our current
index would not be compatible with the new BM25 scoring algorithm.

Regards,
Edwin


On 4 May 2016 at 19:50, Markus Jelsma  wrote:

> No, you don't need to reindex.
> M.
>
> -Original message-
> > From:Zheng Lin Edwin Yeo 
> > Sent: Wednesday 4th May 2016 13:27
> > To: solr-user@lucene.apache.org
> > Subject: Migrating from Solr 5.4 to Solr 6.0
> >
> > Hi,
> >
> > Would like to find out, do we need to re-index our document when we
> migrate
> > from Solr 5.4 to Solr 6.0 because of the change in scoring algorithm to
> > BM25?
> >
> > Regards,
> > Edwin
> >
>


Using MoreLikeThis for multiple documents/keywords

2016-05-04 Thread Zheng Lin Edwin Yeo
Hi,

I would like to find out if it is possible to use MoreLikeThis to get the
response and interesting terms based on 2 or more documents or
keywords, by adding AND or OR operators in the query the way we do
during a normal search?

For example:
http://localhost:8983/solr/collection1/mlt?q=id:collection1_0001 OR
id:collection1_0002=0=10

http://localhost:8983/solr/collection1/mlt?q=keyword1 AND
keyword2=0=10

I'm using Solr 5.4.0

Regards,
Edwin


Re: MoreLikeThis Component - how to get fields of documents

2016-05-04 Thread Zheng Lin Edwin Yeo
Hi Jan,

Which version of Solr are you using?

Regards,
Edwin


On 26 April 2016 at 23:46, Dr. Jan Frederik Maas <
jan.m...@sub.uni-hamburg.de> wrote:

> Hello,
>
> I want to use the moreLikeThis Component to get similar documents from a
> sharded SOLR. This works quite well except for the fact that the documents
> in the moreLikeThis-list only contain the id/unique key of the documents.
>
> Is it possible to get the other fields? I can of course do another query
> for the given IDs, but this would be complicated and very slow.
>
> For example:
>
>
> http://mysolrsystem/?q=id:524507260=true=title=0=true=true=title,id,topic
>
> creates
>
> (...)
> 
> 
> 646199803
> 613210832
> 562239472
> 819200034
> 539877271
> 
> (...)
>
> I tried to modify the fl-parameter, but I can only switch the ID-field in
> the moreLikeThis-Documents on and off (the latter resulting in empty
> document tags). In the result list however, the fl-fields are shown as
> specified.
>
> I would be very grateful for help.
>
> Best wishes,
> Jan Maas
>


Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-04 Thread Susheel Kumar
Thanks, Nick & Hoss.  I am using the exact same machine; I have wiped out
Solr 5.5.0 and installed Solr 6.0.0 with external ZK 3.4.8.  I checked the
file descriptor limit for user solr, which was 12000, and increased it to
52000. I don't see the "too many open files..." error in the Solr log now, but
the Solr connection still gets lost in the Admin panel.
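(For reference, on Linux a per-user file descriptor limit is typically raised in
/etc/security/limits.conf; the user name and value here are hypothetical:

    solr  soft  nofile  65536
    solr  hard  nofile  65536

A re-login, or a restart of the Solr service, is needed for the new limit to
take effect.)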

Let me do some more tests, install the older version back to confirm, and
share the findings.

Thanks,
Susheel

On Wed, May 4, 2016 at 8:11 PM, Chris Hostetter 
wrote:

>
> : Thanks, Nick. Do we know any suggested # for file descriptor limit with
> : Solr6?  Also wondering why i haven't seen this problem before with Solr
> 5.x?
>
> are you running Solr6 on the exact same host OS that you were running
> Solr5 on?
>
> even if you are using the "same OS version" on a diff machine, that could
> explain the discrepency if you (or someone else) increased the file
> descriptor limit on the "old machine" but that neverh appened on the 'new
> machine"
>
>
>
> : On Wed, May 4, 2016 at 4:54 PM, Nick Vasilyev 
> : wrote:
> :
> : > It looks like you have too many open files, try increasing the file
> : > descriptor limit.
> : >
> : > On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar 
> : > wrote:
> : >
> : > > Hello,
> : > >
> : > > I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8 and
> used
> : > the
> : > > install service to setup solr.
> : > >
> : > > After launching Solr Admin Panel on server1, it looses connections
> in few
> : > > seconds and then comes back and other node server2 is marked as Down
> in
> : > > cloud graph. After few seconds its loosing the connection and comes
> back.
> : > >
> : > > Any idea what may be going wrong? Has anyone used Solr 6 with ZK
> 3.4.8.
> : > > Have never seen this error before with solr 5.x with ZK 3.4.6.
> : > >
> : > > Below log from server1 & server2.  The ZK has 3 nodes with chroot
> : > enabled.
> : > >
> : > > Thanks,
> : > > Susheel
> : > >
> : > > server1/solr.log
> : > >
> : > > 
> : > >
> : > >
> : > > 2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
> : > > o.a.s.c.c.ZkStateReader path=[/collections/collection1]
> : > > [configName]=[collection1] specified config exists in ZooKeeper
> : > >
> : > > 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ]
> : > o.a.s.s.HttpSolrCall
> : > > [admin] webapp=null path=/admin/collections
> : > > params={action=CLUSTERSTATUS=json&_=1462389588125} status=0
> QTime=25
> : > >
> : > > 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
> : > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with
> params
> : > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> : > >
> : > > 2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ]
> : > o.a.s.s.HttpSolrCall
> : > > [admin] webapp=null path=/admin/collections
> : > > params={action=LIST=json&_=1462389588125} status=0 QTime=2
> : > >
> : > > 2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ]
> : > o.a.s.s.HttpSolrCall
> : > > [admin] webapp=null path=/admin/cores
> : > > params={indexInfo=false=json&_=1462389588124} status=0 QTime=0
> : > >
> : > > 2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ]
> : > o.a.s.s.HttpSolrCall
> : > > [admin] webapp=null path=/admin/info/system
> : > > params={wt=json&_=1462389588126} status=0 QTime=25
> : > >
> : > > 2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
> : > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with
> params
> : > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> : > >
> : > > 2016-05-04 19:20:57.613 INFO  (qtp1989972246-13) [   ]
> : > o.a.s.s.HttpSolrCall
> : > > [admin] webapp=null path=/admin/collections
> : > > params={action=LIST=json&_=1462389588125} status=0 QTime=3
> : > >
> : > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5980) [   ]
> : > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> : > caught
> : > > when connecting to {}->http://server2:8983: Too many open files
> : > >
> : > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5983) [   ]
> : > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> : > caught
> : > > when connecting to {}->http://server2:8983: Too many open files
> : > >
> : > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5984) [   ]
> : > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> : > caught
> : > > when connecting to {}->http://server2:8983: Too many open files
> : > >
> : > > 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
> : > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->
> http://server2:8983
> : > >
> : > > 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
> : > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> : > caught
> : > > when connecting to {}->http://server2:8983: Too many open files
> : > >
> : > > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> : > > 

Re: Bug in Solr 6 dynamic-fields?

2016-05-04 Thread Alexandre Rafalovitch
I've just answered this on SO, but I think the reason is quite
confusing, and I hope others can comment here.

Basically, the issue is that the "string" field type has docValues enabled,
and therefore with schema version 1.6 any field inheriting from it will be
effectively both searchable and returnable, even if stored/indexed are
explicitly set to false in the field definition.

I am not sure when the string field type was given docValues, but
I suspect we may have a lot of confusion here as a result, especially
since the Admin UI does not show, or allow overriding, values set at
the field type level.
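A minimal illustration of the situation described above (a sketch; the exact
attributes vary between stock schemas):

    <!-- the "string" type declares docValues -->
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" docValues="true"/>

    <!-- schema version 1.6: docValues inherited from the type still make this
         field searchable and returnable (useDocValuesAsStored defaults to true),
         despite indexed="false" stored="false" -->
    <dynamicField name="*_str" type="string" indexed="false" stored="false"/>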

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 5 May 2016 at 04:35, Tech Id  wrote:
> Hi,
>
> We are unable to resolve a problem with dynamic fields in Solr 6.
> The question and details can be found on stack-overflow at
> http://stackoverflow.com/questions/37014345/unable-to-add-new-dynamic-fields-in-solr-6-0/37018450#37018450
>
> If its a real bug, then we can file a JIRA for the same.
>
> Appreciate any help !
> Thanks
> TiD


RE: Integrating grobid with Tika in solr

2016-05-04 Thread Allison, Timothy B.
Y, integrating Tika is non-trivial.  I think Uwe adds the dependencies by hand
with great care, carefully looking at the dependency tree in Maven and
making sure there aren't any conflicts.


-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Wednesday, May 4, 2016 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Integrating grobid with Tika in solr

On 5/4/2016 9:21 AM, Betsey Benagh wrote:
> I’m feeling particularly dense, because I don’t see any Tika jars in
> WEB-INF/lib:

Oops. Sorry about that, I forgot that it's all contrib.  That's my mistake, not 
yours.

The Tika jars are in contrib/extraction/lib, along with a very large number of 
dependencies.

It turns out that I probably have no idea what I'm talking about.  I cannot 
find any version 1.12 downloads on Tika's website that are structured the same 
way as what's in our contrib directory, so I have no idea how to actually do 
the manual upgrade.

I seem to remember hearing about people doing a Tika upgrade manually, but I've 
got no idea how they did it.

Thanks,
Shawn



Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-04 Thread Chris Hostetter

: Thanks, Nick. Do we know any suggested # for file descriptor limit with
: Solr6?  Also wondering why i haven't seen this problem before with Solr 5.x?

are you running Solr6 on the exact same host OS that you were running 
Solr5 on?

even if you are using the "same OS version" on a different machine, that could
explain the discrepancy if you (or someone else) increased the file
descriptor limit on the "old machine" but that never happened on the "new
machine".



: On Wed, May 4, 2016 at 4:54 PM, Nick Vasilyev 
: wrote:
: 
: > It looks like you have too many open files, try increasing the file
: > descriptor limit.
: >
: > On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar 
: > wrote:
: >
: > > Hello,
: > >
: > > I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8 and used
: > the
: > > install service to setup solr.
: > >
: > > After launching Solr Admin Panel on server1, it looses connections in few
: > > seconds and then comes back and other node server2 is marked as Down in
: > > cloud graph. After few seconds its loosing the connection and comes back.
: > >
: > > Any idea what may be going wrong? Has anyone used Solr 6 with ZK 3.4.8.
: > > Have never seen this error before with solr 5.x with ZK 3.4.6.
: > >
: > > Below log from server1 & server2.  The ZK has 3 nodes with chroot
: > enabled.
: > >
: > > Thanks,
: > > Susheel
: > >
: > > server1/solr.log
: > >
: > > 
: > >
: > >
: > > 2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
: > > o.a.s.c.c.ZkStateReader path=[/collections/collection1]
: > > [configName]=[collection1] specified config exists in ZooKeeper
: > >
: > > 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ]
: > o.a.s.s.HttpSolrCall
: > > [admin] webapp=null path=/admin/collections
: > > params={action=CLUSTERSTATUS=json&_=1462389588125} status=0 QTime=25
: > >
: > > 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
: > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
: > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
: > >
: > > 2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ]
: > o.a.s.s.HttpSolrCall
: > > [admin] webapp=null path=/admin/collections
: > > params={action=LIST=json&_=1462389588125} status=0 QTime=2
: > >
: > > 2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ]
: > o.a.s.s.HttpSolrCall
: > > [admin] webapp=null path=/admin/cores
: > > params={indexInfo=false=json&_=1462389588124} status=0 QTime=0
: > >
: > > 2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ]
: > o.a.s.s.HttpSolrCall
: > > [admin] webapp=null path=/admin/info/system
: > > params={wt=json&_=1462389588126} status=0 QTime=25
: > >
: > > 2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
: > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
: > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
: > >
: > > 2016-05-04 19:20:57.613 INFO  (qtp1989972246-13) [   ]
: > o.a.s.s.HttpSolrCall
: > > [admin] webapp=null path=/admin/collections
: > > params={action=LIST=json&_=1462389588125} status=0 QTime=3
: > >
: > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5980) [   ]
: > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
: > caught
: > > when connecting to {}->http://server2:8983: Too many open files
: > >
: > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5983) [   ]
: > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
: > caught
: > > when connecting to {}->http://server2:8983: Too many open files
: > >
: > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5984) [   ]
: > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
: > caught
: > > when connecting to {}->http://server2:8983: Too many open files
: > >
: > > 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
: > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
: > >
: > > 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
: > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
: > caught
: > > when connecting to {}->http://server2:8983: Too many open files
: > >
: > > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
: > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
: > >
: > > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
: > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
: > caught
: > > when connecting to {}->http://server2:8983: Too many open files
: > >
: > > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
: > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
: > >
: > > 2016-05-04 19:21:29.140 INFO  (qtp1989972246-5983) [   ]
: > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
: > >
: > > 2016-05-04 19:21:29.140 INFO  (qtp1989972246-5980) [   ]
: > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
: > >
: > > 2016-05-04 

getZkStateReader() returning NULL

2016-05-04 Thread Boman
I am attempting to check for existence of a collection prior to creating a
new one with that name, using Solrj:

System.out.println("Checking for existence of collection...");
ZkStateReader zkStateReader = this.server.getZkStateReader(); 
zkStateReader.updateClusterState();

this.server was created using:

   this.server = new CloudSolrClient(this.ZK_HOST);

The call this.server.getZkStateReader() consistently returns NULL.

Any help would be appreciated. Thanks.
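(For reference, CloudSolrClient connects to ZooKeeper lazily, so
getZkStateReader() returns null until a connection has been established;
calling connect() first generally avoids this. A minimal sketch, assuming
SolrJ 5.x, with the collection name hypothetical:

    CloudSolrClient server = new CloudSolrClient(ZK_HOST);
    server.connect();  // establish the ZooKeeper session before using the reader
    ZkStateReader zkStateReader = server.getZkStateReader();
    zkStateReader.updateClusterState();
    boolean exists = zkStateReader.getClusterState().hasCollection("mycollection");

)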





Re: OOM script executed

2016-05-04 Thread Chris Hostetter

: You could, but before that I'd try to see what's using your memory and see
: if you can decrease that. Maybe identify why you are running OOM now and
: not with your previous Solr version (assuming you weren't, and that you are
: running with the same JVM settings). A bigger heap usually means more work
: to the GC and less memory available for the OS cache.

FWIW: One of the bugs fixed in 6.0 was that the
oom_killer wasn't being called properly on OOM -- so the fact that you are
getting OOMErrors in 6.0 may not actually be a new thing; it may just be
new that you are being made aware of them by the oom_killer.

https://issues.apache.org/jira/browse/SOLR-8145

That doesn't negate Tomás's excellent advice about trying to determine
what is causing the OOM, but I wouldn't get too hung up on "what changed"
between 5.x and 6.0 -- possibly nothing other than "now you know about
it."
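(For reference, if the heap does turn out to be undersized, it is typically
raised in solr.in.sh, e.g.

    SOLR_HEAP="8g"    # hypothetical value; the bin/solr default is 512m

or per start with bin/solr start -m 8g.)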



: 
: Tomás
: 
: On Sun, May 1, 2016 at 11:20 PM, Bastien Latard - MDPI AG <
: lat...@mdpi.com.invalid> wrote:
: 
: > Hi Guys,
: >
: > I got several times the OOM script executed since I upgraded to Solr6.0:
: >
: > $ cat solr_oom_killer-8983-2016-04-29_15_16_51.log
: > Running OOM killer script for process 26044 for Solr on port 8983
: >
: > Does it mean that I need to increase my JAVA Heap?
: > Or should I do anything else?
: >
: > Here are some further logs:
: > $ cat solr_gc_log_20160502_0730:
: > }
: > {Heap before GC invocations=1674 (full 91):
: >  par new generation   total 1747648K, used 1747135K [0x0005c000,
: > 0x00064000, 0x00064000)
: >   eden space 1398144K, 100% used [0x0005c000, 0x00061556,
: > 0x00061556)
: >   from space 349504K,  99% used [0x00061556, 0x00062aa2fc30,
: > 0x00062aab)
: >   to   space 349504K,   0% used [0x00062aab, 0x00062aab,
: > 0x00064000)
: >  concurrent mark-sweep generation total 6291456K, used 6291455K
: > [0x00064000, 0x0007c000, 0x0007c000)
: >  Metaspace   used 39845K, capacity 40346K, committed 40704K, reserved
: > 1085440K
: >   class spaceused 4142K, capacity 4273K, committed 4368K, reserved
: > 1048576K
: > 2016-04-29T21:15:41.970+0200: 20356.359: [Full GC (Allocation Failure)
: > 2016-04-29T21:15:41.970+0200: 20356.359: [CMS:
: > 6291455K->6291456K(6291456K), 12.5694653 secs]
: > 8038591K->8038590K(8039104K), [Metaspace: 39845K->39845K(1085440K)],
: > 12.5695497 secs] [Times: user=12.57 sys=0.00, real=12.57 secs]
: >
: >
: > Kind regards,
: > Bastien
: >
: >
: 

-Hoss
http://www.lucidworks.com/

Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-04 Thread Nick Vasilyev
Not sure about your environment, so it's hard to say why you haven't run
into this issue before.

As for a suggested limit, I am not sure; it would depend on your system
and on whether you really want to limit it. I personally just jack it up to 5.

On Wed, May 4, 2016 at 6:13 PM, Susheel Kumar  wrote:

> Thanks, Nick. Do we know any suggested # for file descriptor limit with
> Solr6?  Also wondering why i haven't seen this problem before with Solr
> 5.x?
>
> On Wed, May 4, 2016 at 4:54 PM, Nick Vasilyev 
> wrote:
>
> > It looks like you have too many open files, try increasing the file
> > descriptor limit.
> >
> > On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar 
> > wrote:
> >
> > > Hello,
> > >
> > > I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8 and used
> > the
> > > install service to setup solr.
> > >
> > > After launching Solr Admin Panel on server1, it looses connections in
> few
> > > seconds and then comes back and other node server2 is marked as Down in
> > > cloud graph. After few seconds its loosing the connection and comes
> back.
> > >
> > > Any idea what may be going wrong? Has anyone used Solr 6 with ZK 3.4.8.
> > > Have never seen this error before with solr 5.x with ZK 3.4.6.
> > >
> > > Below log from server1 & server2.  The ZK has 3 nodes with chroot
> > enabled.
> > >
> > > Thanks,
> > > Susheel
> > >
> > > server1/solr.log
> > >
> > > 
> > >
> > >
> > > 2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
> > > o.a.s.c.c.ZkStateReader path=[/collections/collection1]
> > > [configName]=[collection1] specified config exists in ZooKeeper
> > >
> > > 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ]
> > o.a.s.s.HttpSolrCall
> > > [admin] webapp=null path=/admin/collections
> > > params={action=CLUSTERSTATUS=json&_=1462389588125} status=0 QTime=25
> > >
> > > 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
> > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with
> params
> > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> > >
> > > 2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ]
> > o.a.s.s.HttpSolrCall
> > > [admin] webapp=null path=/admin/collections
> > > params={action=LIST=json&_=1462389588125} status=0 QTime=2
> > >
> > > 2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ]
> > o.a.s.s.HttpSolrCall
> > > [admin] webapp=null path=/admin/cores
> > > params={indexInfo=false=json&_=1462389588124} status=0 QTime=0
> > >
> > > 2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ]
> > o.a.s.s.HttpSolrCall
> > > [admin] webapp=null path=/admin/info/system
> > > params={wt=json&_=1462389588126} status=0 QTime=25
> > >
> > > 2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
> > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with
> params
> > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> > >
> > > 2016-05-04 19:20:57.613 INFO  (qtp1989972246-13) [   ]
> > o.a.s.s.HttpSolrCall
> > > [admin] webapp=null path=/admin/collections
> > > params={action=LIST=json&_=1462389588125} status=0 QTime=3
> > >
> > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5980) [   ]
> > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> > caught
> > > when connecting to {}->http://server2:8983: Too many open files
> > >
> > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5983) [   ]
> > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> > caught
> > > when connecting to {}->http://server2:8983: Too many open files
> > >
> > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5984) [   ]
> > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> > caught
> > > when connecting to {}->http://server2:8983: Too many open files
> > >
> > > 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
> > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->
> http://server2:8983
> > >
> > > 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
> > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> > caught
> > > when connecting to {}->http://server2:8983: Too many open files
> > >
> > > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->
> http://server2:8983
> > >
> > > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> > caught
> > > when connecting to {}->http://server2:8983: Too many open files
> > >
> > > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->
> http://server2:8983
> > >
> > > 2016-05-04 19:21:29.140 INFO  (qtp1989972246-5983) [   ]
> > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->
> http://server2:8983
> > >
> > > 2016-05-04 19:21:29.140 INFO  (qtp1989972246-5980) [   ]
> > > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->
> 

Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-04 Thread Susheel Kumar
Thanks, Nick. Do we know any suggested # for the file descriptor limit with
Solr 6?  Also wondering why I haven't seen this problem before with Solr 5.x?

On Wed, May 4, 2016 at 4:54 PM, Nick Vasilyev 
wrote:

> It looks like you have too many open files, try increasing the file
> descriptor limit.
>
> On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar 
> wrote:
>
> > Hello,
> >
> > I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8 and used
> the
> > install service to setup solr.
> >
> > After launching Solr Admin Panel on server1, it looses connections in few
> > seconds and then comes back and other node server2 is marked as Down in
> > cloud graph. After few seconds its loosing the connection and comes back.
> >
> > Any idea what may be going wrong? Has anyone used Solr 6 with ZK 3.4.8.
> > Have never seen this error before with solr 5.x with ZK 3.4.6.
> >
> > Below log from server1 & server2.  The ZK has 3 nodes with chroot
> enabled.
> >
> > Thanks,
> > Susheel
> >
> > server1/solr.log
> >
> > 
> >
> >
> > 2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
> > o.a.s.c.c.ZkStateReader path=[/collections/collection1]
> > [configName]=[collection1] specified config exists in ZooKeeper
> >
> > 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ]
> o.a.s.s.HttpSolrCall
> > [admin] webapp=null path=/admin/collections
> > params={action=CLUSTERSTATUS=json&_=1462389588125} status=0 QTime=25
> >
> > 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
> > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
> > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> >
> > 2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ]
> o.a.s.s.HttpSolrCall
> > [admin] webapp=null path=/admin/collections
> > params={action=LIST=json&_=1462389588125} status=0 QTime=2
> >
> > 2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ]
> o.a.s.s.HttpSolrCall
> > [admin] webapp=null path=/admin/cores
> > params={indexInfo=false=json&_=1462389588124} status=0 QTime=0
> >
> > 2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ]
> o.a.s.s.HttpSolrCall
> > [admin] webapp=null path=/admin/info/system
> > params={wt=json&_=1462389588126} status=0 QTime=25
> >
> > 2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
> > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
> > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> >
> > 2016-05-04 19:20:57.613 INFO  (qtp1989972246-13) [   ]
> o.a.s.s.HttpSolrCall
> > [admin] webapp=null path=/admin/collections
> > params={action=LIST=json&_=1462389588125} status=0 QTime=3
> >
> > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5980) [   ]
> > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> caught
> > when connecting to {}->http://server2:8983: Too many open files
> >
> > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5983) [   ]
> > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> caught
> > when connecting to {}->http://server2:8983: Too many open files
> >
> > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5984) [   ]
> > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> caught
> > when connecting to {}->http://server2:8983: Too many open files
> >
> > 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
> > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
> >
> > 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
> > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> caught
> > when connecting to {}->http://server2:8983: Too many open files
> >
> > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
> >
> > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> caught
> > when connecting to {}->http://server2:8983: Too many open files
> >
> > 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
> >
> > 2016-05-04 19:21:29.140 INFO  (qtp1989972246-5983) [   ]
> > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
> >
> > 2016-05-04 19:21:29.140 INFO  (qtp1989972246-5980) [   ]
> > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
> >
> > 2016-05-04 19:21:29.143 INFO  (qtp1989972246-5983) [   ]
> > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> caught
> > when connecting to {}->http://server2:8983: Too many open files
> >
> > 2016-05-04 19:21:29.144 INFO  (qtp1989972246-5983) [   ]
> > o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
> >
> > 2016-05-04 19:21:29.144 INFO  (qtp1989972246-5980) [   ]
> > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
> caught
> > when connecting to {}->http://server2:8983: Too many open 

Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-04 Thread Nick Vasilyev
It looks like you have too many open files; try increasing the file
descriptor limit.

On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar  wrote:

> Hello,
>
> I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8 and used the
> install service to setup solr.
>
> After launching Solr Admin Panel on server1, it looses connections in few
> seconds and then comes back and other node server2 is marked as Down in
> cloud graph. After few seconds its loosing the connection and comes back.
>
> Any idea what may be going wrong? Has anyone used Solr 6 with ZK 3.4.8.
> Have never seen this error before with solr 5.x with ZK 3.4.6.
>
> Below log from server1 & server2.  The ZK has 3 nodes with chroot enabled.
>
> Thanks,
> Susheel
>
> server1/solr.log
>
> 
>
>
> 2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
> o.a.s.c.c.ZkStateReader path=[/collections/collection1]
> [configName]=[collection1] specified config exists in ZooKeeper
>
> 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ] o.a.s.s.HttpSolrCall
> [admin] webapp=null path=/admin/collections
> params={action=CLUSTERSTATUS=json&_=1462389588125} status=0 QTime=25
>
> 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
> o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
> action=LIST=json&_=1462389588125 and sendToOCPQueue=true
>
> 2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ] o.a.s.s.HttpSolrCall
> [admin] webapp=null path=/admin/collections
> params={action=LIST=json&_=1462389588125} status=0 QTime=2
>
> 2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ] o.a.s.s.HttpSolrCall
> [admin] webapp=null path=/admin/cores
> params={indexInfo=false=json&_=1462389588124} status=0 QTime=0
>
> 2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ] o.a.s.s.HttpSolrCall
> [admin] webapp=null path=/admin/info/system
> params={wt=json&_=1462389588126} status=0 QTime=25
>
> 2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
> o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
> action=LIST=json&_=1462389588125 and sendToOCPQueue=true
>
> 2016-05-04 19:20:57.613 INFO  (qtp1989972246-13) [   ] o.a.s.s.HttpSolrCall
> [admin] webapp=null path=/admin/collections
> params={action=LIST=json&_=1462389588125} status=0 QTime=3
>
> 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5980) [   ]
> o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
> when connecting to {}->http://server2:8983: Too many open files
>
> 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5983) [   ]
> o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
> when connecting to {}->http://server2:8983: Too many open files
>
> 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5984) [   ]
> o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
> when connecting to {}->http://server2:8983: Too many open files
>
> 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
> o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
>
> 2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
> o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
> when connecting to {}->http://server2:8983: Too many open files
>
> 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
>
> 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
> when connecting to {}->http://server2:8983: Too many open files
>
> 2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
> o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
>
> 2016-05-04 19:21:29.140 INFO  (qtp1989972246-5983) [   ]
> o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
>
> 2016-05-04 19:21:29.140 INFO  (qtp1989972246-5980) [   ]
> o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
>
> 2016-05-04 19:21:29.143 INFO  (qtp1989972246-5983) [   ]
> o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
> when connecting to {}->http://server2:8983: Too many open files
>
> 2016-05-04 19:21:29.144 INFO  (qtp1989972246-5983) [   ]
> o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983
>
> 2016-05-04 19:21:29.144 INFO  (qtp1989972246-5980) [   ]
> o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
> when connecting to {}->http://server2:8983: Too many open files
>
> 2016-05-04 19:21:29.144 INFO  (qtp1989972246-5983) [   ]
> o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
> when connecting to {}->http://server2:8983: Too many open files
>
> 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ] o.a.s.s.HttpSolrCall
> [admin] webapp=null path=/admin/collections
> params={action=CLUSTERSTATUS=json&_=1462389588125} status=0 QTime=25
>
> 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]

Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-04 Thread Susheel Kumar
Hello,

I am trying to set up a 2-node Solr cloud 6 cluster with ZK 3.4.8, and used the
install service to set up Solr.

After launching the Solr Admin Panel on server1, it loses its connection within
a few seconds and then comes back, and the other node, server2, is marked as
Down in the cloud graph. Every few seconds it loses the connection and comes back.

Any idea what may be going wrong? Has anyone used Solr 6 with ZK 3.4.8?
I have never seen this error with Solr 5.x and ZK 3.4.6.

Below log from server1 & server2.  The ZK has 3 nodes with chroot enabled.

Thanks,
Susheel

server1/solr.log




2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
o.a.s.c.c.ZkStateReader path=[/collections/collection1]
[configName]=[collection1] specified config exists in ZooKeeper

2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ] o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/collections
params={action=CLUSTERSTATUS=json&_=1462389588125} status=0 QTime=25

2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
action=LIST=json&_=1462389588125 and sendToOCPQueue=true

2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ] o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/collections
params={action=LIST=json&_=1462389588125} status=0 QTime=2

2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ] o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/cores
params={indexInfo=false=json&_=1462389588124} status=0 QTime=0

2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ] o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/info/system
params={wt=json&_=1462389588126} status=0 QTime=25

2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
action=LIST=json&_=1462389588125 and sendToOCPQueue=true

2016-05-04 19:20:57.613 INFO  (qtp1989972246-13) [   ] o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/collections
params={action=LIST=json&_=1462389588125} status=0 QTime=3

2016-05-04 19:21:29.139 INFO  (qtp1989972246-5980) [   ]
o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
when connecting to {}->http://server2:8983: Too many open files

2016-05-04 19:21:29.139 INFO  (qtp1989972246-5983) [   ]
o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
when connecting to {}->http://server2:8983: Too many open files

2016-05-04 19:21:29.139 INFO  (qtp1989972246-5984) [   ]
o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
when connecting to {}->http://server2:8983: Too many open files

2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983

2016-05-04 19:21:29.141 INFO  (qtp1989972246-5984) [   ]
o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
when connecting to {}->http://server2:8983: Too many open files

2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983

2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
when connecting to {}->http://server2:8983: Too many open files

2016-05-04 19:21:29.142 INFO  (qtp1989972246-5984) [   ]
o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983

2016-05-04 19:21:29.140 INFO  (qtp1989972246-5983) [   ]
o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983

2016-05-04 19:21:29.140 INFO  (qtp1989972246-5980) [   ]
o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983

2016-05-04 19:21:29.143 INFO  (qtp1989972246-5983) [   ]
o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
when connecting to {}->http://server2:8983: Too many open files

2016-05-04 19:21:29.144 INFO  (qtp1989972246-5983) [   ]
o.a.h.i.c.DefaultHttpClient Retrying connect to {}->http://server2:8983

2016-05-04 19:21:29.144 INFO  (qtp1989972246-5980) [   ]
o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
when connecting to {}->http://server2:8983: Too many open files

2016-05-04 19:21:29.144 INFO  (qtp1989972246-5983) [   ]
o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException) caught
when connecting to {}->http://server2:8983: Too many open files

2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ] o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/collections
params={action=CLUSTERSTATUS=json&_=1462389588125} status=0 QTime=25

2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with params
action=LIST=json&_=1462389588125 and sendToOCPQueue=true

2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ] o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/collections
params={action=LIST=json&_=1462389588125} status=0 QTime=2

2016-05-04 19:20:57.520 INFO  

Faceting and Grouping Performance Degradation in Solr 5

2016-05-04 Thread Solr User
I was recently attempting to upgrade from Solr 4.8.1 to Solr 5.4.1, but had
to abort because average response times degraded relative to a baseline volume
performance test.  The affected queries involved faceting (both the enum method
and the default) and grouping.  There is a critical bug,
https://issues.apache.org/jira/browse/SOLR-8096, currently open, which I
gather is the cause of the slower response times.  One concern I have is
that discussions around the issue offer the suggestion of indexing with
docValues, which alleviated the problem in at least that one reported case.
However, indexing with docValues did not improve the performance in my case.
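(For reference, "indexing with docValues" here means re-indexing after enabling
the attribute on the faceted/grouped fields; the field name below is
hypothetical:

    <field name="category" type="string" indexed="true" stored="true" docValues="true"/>

)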

Can someone please confirm or correct my understanding that this issue has
no path forward at this time and specifically that it is already known that
docValues does not necessarily solve this?

Thanks in advance!


Re: Integrating grobid with Tika in solr

2016-05-04 Thread Shawn Heisey
On 5/4/2016 9:21 AM, Betsey Benagh wrote:
> I’m feeling particularly dense, because I don’t see any Tika jars in
> WEB-INF/lib:

Oops. Sorry about that, I forgot that it's all contrib.  That's my
mistake, not yours.

The Tika jars are in contrib/extraction/lib, along with a very large
number of dependencies.

It turns out that I probably have no idea what I'm talking about.  I
cannot find any version 1.12 downloads on Tika's website that are
structured the same way as what's in our contrib directory, so I have no
idea how to actually do the manual upgrade.

I seem to remember hearing about people doing a Tika upgrade manually,
but I've got no idea how they did it.

Thanks,
Shawn



Bug in Solr 6 dynamic-fields?

2016-05-04 Thread Tech Id
Hi,

We are unable to resolve a problem with dynamic fields in Solr 6.
The question and details can be found on stack-overflow at
http://stackoverflow.com/questions/37014345/unable-to-add-new-dynamic-fields-in-solr-6-0/37018450#37018450

If its a real bug, then we can file a JIRA for the same.

Appreciate any help !
Thanks
TiD


Re: Integrating grobid with Tika in solr

2016-05-04 Thread Betsey Benagh
As a workaround, I’m trying to run Grobid on my files, and then import the
corresponding XML into Solr.

I don’t see any errors on the post:

bba0124$ bin/post -c lrdtest ~/software/grobid/out/021002_1.tei.xml
/Library/Java/JavaVirtualMachines/jdk1.8.0_71.jdk/Contents/Home/bin/java
-classpath /Users/bba0124/software/solr-5.5.0/dist/solr-core-5.5.0.jar
-Dauto=yes -Dc=lrdtest -Ddata=files org.apache.solr.util.SimplePostTool
/Users/bba0124/software/grobid/out/021002_1.tei.xml
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/lrdtest/update...
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file 021002_1.tei.xml (application/xml) to [base]
1 files indexed.
COMMITting Solr index changes to
http://localhost:8983/solr/lrdtest/update...
Time spent: 0:00:00.027

But the documents don’t seem to show up in the index, either.


Additionally, if I try uploading the documents using the web UI, they
appear to upload successfully,

Response:{
  "responseHeader": {
"status": 0,
"QTime": 7
  }
}


But aren’t in the index.

What am I missing?

On 5/4/16, 10:55 AM, "Shawn Heisey"  wrote:

>On 5/4/2016 8:38 AM, Betsey Benagh wrote:
>> Thanks, I'm currently using 5.5, and will try upgrading to 6.0.
>>
>>
>> On 5/4/16, 10:37 AM, "Allison, Timothy B."  wrote:
>>> Y. Solr 6.0.0 is shipping with Tika 1.7.  Grobid came in with Tika
>>>1.11.
>
>Just upgrading to 6.0.0 isn't enough.  As Tim said, Solr 6 currently
>uses Tika 1.7, but 1.11 is required.  That's four minor versions behind
>the minimum.
>
>Tim has filed an issue for upgrading Tika to 1.13 in Solr, which he did
>mention in a previous reply, but I do not know when it will be
>available.  Tim might have a better idea.
>
>https://issues.apache.org/jira/browse/SOLR-8981
>
>You might be able to upgrade Tika in your Solr install to 1.12 yourself
>by simply replacing the jar in WEB-INF/lib ... but I do not know whether
>this will cause any other problems.  Historically, replacing the jar has
>been a safe option ... but I can't guarantee that this will always be
>the case.
>
>Thanks,
>Shawn
>



Re: What does the "Max Doc" means in Admin interface?

2016-05-04 Thread John Bickerstaff
Max Doc is the total number of documents in the collection, INCLUDING the
ones that have been deleted but not yet purged from the index.  Don't worry,
deleted docs are not returned in search results.

Yes, you can change the number by "optimizing" (see the button), but this
does take time and bandwidth, so use it in a way that won't negatively
affect production.  Right after the optimize, Num Docs and Max Docs
should be the same, I believe.
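(For reference, an optimize can also be triggered outside the Admin UI with a
plain update request; the collection name here is hypothetical:

    curl 'http://localhost:8983/solr/mycollection/update?optimize=true'

Once it completes, the deleted documents are purged and Max Docs drops back to
Num Docs.)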

The -1 is (as I learned in an email on this forum a day or three ago) a
sign of a bug and should be ignored for now.

On Mon, May 2, 2016 at 12:25 AM, Bastien Latard - MDPI AG <
lat...@mdpi.com.invalid> wrote:

> Hi All,
>
> Everything is in the title...
>
>
> Can this value be modified?
> Or is it because of my environment?
>
> Also, what does "Heap Memory Usage: -1" mean?
>
> Kind regards,
> Bastien Latard
> Web engineer
> --
> MDPI AG
> Postfach, CH-4005 Basel, Switzerland
> Office: Klybeckstrasse 64, CH-4057
> Tel. +41 61 683 77 35
> Fax: +41 61 302 89 18
> E-mail: latard@mdpi.com  http://www.mdpi.com/
>
>


ReversedWildcardFilterFactory question

2016-05-04 Thread Susheel Kumar
Hello,

I wanted to confirm that using the type below for fields where users *may
also* search with a leading wildcard is a good solution, and that the edismax
query parser would automatically reverse the query string in the case of a
leading wildcard search, e.g. q=text:*plane would automatically be reversed by
the edismax query parser to search for plane*?

Thanks,
Susheel
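(The field type XML in the original message was stripped by the list archiver.
For reference, the text_general_rev type shipped with the Solr example schema,
which is presumably close to what was quoted, looks like this:

    <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
                maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

)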




Re: SOLR edismax and mm request parameter

2016-05-04 Thread ND
If I am understanding you correctly, it sounds like you are looking for a
phrase match with a really large query slop parameter (qs,
https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Theqs%28QueryPhraseSlop%29Parameter).
I believe the old way of doing this would be passing
something like "Find This Phrase"~100 to effectively ask the search to
return "Find" and "This" and "Phrase" within a million words (or
equivalently you can pass a qs param of 100), but you would have to send
a quoted query string to use qs (not well documented in the new wiki, but the
old one states as much).

Hope this helps

Nick

On Wed, May 4, 2016 at 3:18 AM, Mark Robinson 
wrote:

> Thanks for the mail Jaques.
> I have a doubt here.
>
> When we use q.op=AND what I understood is, ALL query terms should be
> present any where across the various "qf" fields ie all of the query terms
> need not be present in one single field, but just need to be present for
> sure among the various qf fields.
>
> My requirement is I need *ALL query terms* to be present in at least *any
> one of the qf fields* for a doc to qualify.
> So not sure whether q.op=AND will help me.
>
> Pls correct me if I am missing something.
>
> Thanks!
> Mark.
>
>
>
> On Wed, May 4, 2016 at 5:57 AM, Jacques du Rand 
> wrote:
>
> > Sorry I meant "Ahmet Arslan" answer :)
> >
> >
> > On 4 May 2016 at 11:56, Jacques du Rand 
> wrote:
> >
> > > Although Mark Robinson's answer is correct you are now using the DISMAX
> > > not the Edismax parser...
> > > You can also play around with changing  q.op parameter  to 'AND'
> > >
> > >
> > >
> > > On 4 May 2016 at 11:40, Mark Robinson  wrote:
> > >
> > >> Thanks much Ahmet!
> > >>
> > >> I will try that out.
> > >>
> > >> Best,
> > >> Mark
> > >>
> > >> On Tue, May 3, 2016 at 11:53 PM, Ahmet Arslan
>  > >
> > >> wrote:
> > >>
> > >> > Hi Mark,
> > >> >
> > >> > You could do something like this:
> > >> >
> > >> > _query_:{!dismax qf='field1' mm='100%' v=$qq}
> > >> > OR
> > >> > _query_:{!dismax qf='field2' mm='100%' v=$qq}
> > >> > OR
> > >> > _query_:{!dismax qf='field3' mm='100%' v=$qq}
> > >> >
> > >> >
> > >> >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
> > >> >
> > >> > Ahmet
> > >> >
> > >> >
> > >> >
> > >> > On Wednesday, May 4, 2016 4:59 AM, Mark Robinson <
> > >> mark123lea...@gmail.com>
> > >> > wrote:
> > >> > Hi,
> > >> > On further checking cld identify that *blue *is indeed appearing in
> > one
> > >> of
> > >> > the qf fields.My bad!
> > >> >
> > >> > Cld someone pls help me with the 2nd question.
> > >> >
> > >> > Thanks!
> > >> > Mark.
> > >> >
> > >> >
> > >> >
> > >> > On Tue, May 3, 2016 at 8:03 PM, Mark Robinson <
> > mark123lea...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > I made a typo err in the prev mail for my first question when I
> > listed
> > >> > the
> > >> > > query terms.
> > >> > > Let me re-type both questions here once again pls.
> > >> > > Sorry for any inconvenience.
> > >> > >
> > >> > > 1.
> > >> > > My understanding of the mm parameter related to edismax is that,
> > >> > > if mm=100%,  only if ALL my query terms appear across any of the
> qf
> > >> > fields
> > >> > > will I get back
> > >> > > documents ... ie all the terms *need not be present in one single
> > >> field*
> > >> > ..
> > >> > > they just need to be present across any of the fields in my qf
> list.
> > >> > >
> > >> > > But my query for  the terms:-
> > >> > > *blue stainless washer*
> > >> > > ... returns a document which has *Stainless Washer *in one of my
> qf
> > >> > > fields, but *blue *is not there in any of the qf fields. Then how
> > did
> > >> it
> > >> > > get returned even though I had given mm=100% (100%25 when I typed
> > >> > directly
> > >> > > in browser). Any suggestions please.. In fact this is my first
> > record!
> > >> > >
> > >> > > 2.
> > >> > > Another question I have is:-
> > >> > > With edismax can I enforce that all my query terms should appear
> in
> > >> ANY
> > >> > of
> > >> > > my qf fields to qualify as a result document? I know all terms
> > >> appearing
> > >> > in
> > >> > > a single field can give a boost if we use the "pf" query parameter
> > >> > > accordingly. But how can I insist that to qualify as a result, the
> > doc
> > >> > > should have ALL of my query term in one or more of the qf fields?
> > >> > >
> > >> > >
> > >> > > Cld some one pls help.
> > >> > >
> > >> > > Thanks!
> > >> > >
> > >> > > Mark
> > >> > >
> > >> > > On Tue, May 3, 2016 at 6:28 PM, Mark Robinson <
> > >> mark123lea...@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > >> Hi,
> > >> > >>
> > >> > >> 1.
> > >> > >> My understanding of the mm parameter related to edismax is that,
> > >> > >> if mm=100%,  only if ALL my query terms appear across 

Re: Integrating grobid with Tika in solr

2016-05-04 Thread Betsey Benagh
I’m feeling particularly dense, because I don’t see any Tika jars in
WEB-INF/lib:
antlr4-runtime-4.5.1-1.jar
 asm-5.0.4.jar
 asm-commons-5.0.4.jar
 commons-cli-1.2.jar
 commons-codec-1.10.jar
 commons-collections-3.2.2.jar
 commons-configuration-1.6.jar
 commons-exec-1.3.jar
 commons-fileupload-1.2.1.jar
 commons-io-2.4.jar
 commons-lang-2.6.jar
 concurrentlinkedhashmap-lru-1.2.jar
 dom4j-1.6.1.jar
 guava-14.0.1.jar
 hadoop-annotations-2.6.0.jar
 hadoop-auth-2.6.0.jar
 hadoop-common-2.6.0.jar
 hadoop-hdfs-2.6.0.jar
 hppc-0.7.1.jar
 htrace-core-3.0.4.jar
 httpclient-4.4.1.jar
 httpcore-4.4.1.jar
 httpmime-4.4.1.jar
 jackson-core-2.5.4.jar
 jackson-dataformat-smile-2.5.4.jar
 joda-time-2.2.jar
 listing.txt
 lucene-analyzers-common-5.5.0.jar
 lucene-analyzers-kuromoji-5.5.0.jar
 lucene-analyzers-phonetic-5.5.0.jar
 lucene-backward-codecs-5.5.0.jar
 lucene-codecs-5.5.0.jar
 lucene-core-5.5.0.jar
 lucene-expressions-5.5.0.jar
 lucene-grouping-5.5.0.jar
 lucene-highlighter-5.5.0.jar
 lucene-join-5.5.0.jar
 lucene-memory-5.5.0.jar
 lucene-misc-5.5.0.jar
 lucene-queries-5.5.0.jar
 lucene-queryparser-5.5.0.jar
 lucene-sandbox-5.5.0.jar
 lucene-spatial-5.5.0.jar
 lucene-suggest-5.5.0.jar
 noggit-0.6.jar
 org.restlet-2.3.0.jar
 org.restlet.ext.servlet-2.3.0.jar
 protobuf-java-2.5.0.jar
 solr-core-5.5.0.jar
 solr-solrj-5.5.0.jar
 spatial4j-0.5.jar
 stax2-api-3.1.4.jar
 t-digest-3.1.jar
 woodstox-core-asl-4.4.1.jar
 zookeeper-3.4.6.jar


On 5/4/16, 10:55 AM, "Shawn Heisey"  wrote:

>On 5/4/2016 8:38 AM, Betsey Benagh wrote:
>> Thanks, I'm currently using 5.5, and will try upgrading to 6.0.
>>
>>
>> On 5/4/16, 10:37 AM, "Allison, Timothy B."  wrote:
>>> Y. Solr 6.0.0 is shipping with Tika 1.7.  Grobid came in with Tika
>>>1.11.
>
>Just upgrading to 6.0.0 isn't enough.  As Tim said, Solr 6 currently
>uses Tika 1.7, but 1.11 is required.  That's four minor versions behind
>the minimum.
>
>Tim has filed an issue for upgrading Tika to 1.13 in Solr, which he did
>mention in a previous reply, but I do not know when it will be
>available.  Tim might have a better idea.
>
>https://issues.apache.org/jira/browse/SOLR-8981
>
>You might be able to upgrade Tika in your Solr install to 1.12 yourself
>by simply replacing the jar in WEB-INF/lib ... but I do not know whether
>this will cause any other problems.  Historically, replacing the jar has
>been a safe option ... but I can't guarantee that this will always be
>the case.
>
>Thanks,
>Shawn
>



Re: Integrating grobid with Tika in solr

2016-05-04 Thread Shawn Heisey
On 5/4/2016 8:38 AM, Betsey Benagh wrote:
> Thanks, I'm currently using 5.5, and will try upgrading to 6.0.
>
>
> On 5/4/16, 10:37 AM, "Allison, Timothy B."  wrote:
>> Y. Solr 6.0.0 is shipping with Tika 1.7.  Grobid came in with Tika 1.11.

Just upgrading to 6.0.0 isn't enough.  As Tim said, Solr 6 currently
uses Tika 1.7, but 1.11 is required.  That's four minor versions behind
the minimum.

Tim has filed an issue for upgrading Tika to 1.13 in Solr, which he did
mention in a previous reply, but I do not know when it will be
available.  Tim might have a better idea.

https://issues.apache.org/jira/browse/SOLR-8981

You might be able to upgrade Tika in your Solr install to 1.12 yourself
by simply replacing the jar in WEB-INF/lib ... but I do not know whether
this will cause any other problems.  Historically, replacing the jar has
been a safe option ... but I can't guarantee that this will always be
the case.
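
As a sketch, the swap might look something like this on a stock 5.x download
(note that in the stock layout the Tika jars actually live under
contrib/extraction/lib rather than WEB-INF/lib; the jar names below are
assumptions, so check what your install actually ships):

bin/solr stop
# remove the Tika 1.7 jars the extraction contrib loads
rm contrib/extraction/lib/tika-*-1.7.jar
# drop in the 1.12 equivalents
cp /path/to/tika-core-1.12.jar /path/to/tika-parsers-1.12.jar contrib/extraction/lib/
bin/solr start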

Thanks,
Shawn



Re: Integrating grobid with Tika in solr

2016-05-04 Thread Betsey Benagh
Thanks, I'm currently using 5.5, and will try upgrading to 6.0.


On 5/4/16, 10:37 AM, "Allison, Timothy B."  wrote:

>Y. Solr 6.0.0 is shipping with Tika 1.7.  Grobid came in with Tika 1.11.
>
>-Original Message-
>From: Allison, Timothy B. [mailto:talli...@mitre.org]
>Sent: Wednesday, May 4, 2016 10:29 AM
>To: solr-user@lucene.apache.org
>Subject: RE: Integrating grobid with Tika in solr
>
>I think Solr is using a version of Tika that predates that addition of
>the Grobid parser.  You'll have to add that manually somehow until Solr
>upgrades to Tika 1.13 (soon to be released...I think).  SOLR-8981.
>
>-Original Message-
>From: Betsey Benagh [mailto:betsey.ben...@stresearch.com]
>Sent: Wednesday, May 4, 2016 10:07 AM
>To: solr-user@lucene.apache.org
>Subject: Re: Integrating grobid with Tika in solr
>
>Grobid runs as a service, and I'm (theoretically) configuring Tika to
>call it.
>
>From the Grobid wiki, here are instructions for integrating with Tika
>application:
>
>First we need to create the GrobidExtractor.properties file that points
>to the Grobid REST Service. My file looks like the following:
>
>grobid.server.url=http://localhost:[port]
>
>Now you can run GROBID via Tika-app with the following command on a
>sample PDF file.
>
>java -classpath 
>$HOME/src/grobidparser-resources/:tika-app-1.11-SNAPSHOT.jar
>org.apache.tika.cli.TikaCLI
>--config=$HOME/src/grobidparser-resources/tika-config.xml -J
>$HOME/src/grobid/papers/ICSE06.pdf
>
>Here's the stack trace.
>
><str name="error-class">org.apache.solr.common.SolrException</str>
><str name="root-error-class">java.lang.ClassNotFoundException</str>
><str name="msg">org.apache.tika.exception.TikaException: Unable to find a
>parser class: org.apache.tika.parser.journal.JournalParser</str>
><str name="trace">org.apache.solr.common.SolrException:
>org.apache.tika.exception.TikaException: Unable to find a parser class:
>org.apache.tika.parser.journal.JournalParser
>at 
>org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(Extract
>ingRequestHandler.java:82)
>at 
>org.apache.solr.core.PluginBag$LazyPluginHolder.createInst(PluginBag.java:
>367)
>at org.apache.solr.core.PluginBag$LazyPluginHolder.get(PluginBag.java:348)
>at org.apache.solr.core.PluginBag.get(PluginBag.java:148)
>at 
>org.apache.solr.handler.RequestHandlerBase.getRequestHandler(RequestHandle
>rBase.java:231)
>at org.apache.solr.core.SolrCore.getRequestHandler(SolrCore.java:1362)
>at 
>org.apache.solr.servlet.HttpSolrCall.extractHandlerFromURLPath(HttpSolrCal
>l.java:326)
>at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:296)
>at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:412)
>at 
>org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.jav
>a:225)
>at 
>org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.jav
>a:183)
>at 
>org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandl
>er.java:1652)
>at 
>org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>at 
>org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:1
>43)
>at 
>org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577
>)
>at 
>org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.ja
>va:223)
>at 
>org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.ja
>va:1127)
>at 
>org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>at 
>org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.jav
>a:185)
>at 
>org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.jav
>a:1061)
>at 
>org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:1
>41)
>at 
>org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHa
>ndlerCollection.java:215)
>at 
>org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollectio
>n.java:110)
>at 
>org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java
>:97)
>at org.eclipse.jetty.server.Server.handle(Server.java:499)
>at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>at 
>org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257
>)
>at 
>org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>at 
>org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.jav
>a:635)
>at 
>org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java
>:555)
>at java.lang.Thread.run(Thread.java:745)
>Caused by: org.apache.tika.exception.TikaException: Unable to find a
>parser class: org.apache.tika.parser.journal.JournalParser
>at 
>org.apache.tika.config.TikaConfig.parserFromDomElement(TikaConfig.java:362
>)
>at org.apache.tika.config.TikaConfig.init(TikaConfig.java:127)
>at org.apache.tika.config.TikaConfig.init(TikaConfig.java:115)
>at org.apache.tika.config.TikaConfig.init(TikaConfig.java:111)
>at org.apache.tika.config.TikaConfig.init(TikaConfig.java:92)
>at 

RE: Integrating grobid with Tika in solr

2016-05-04 Thread Allison, Timothy B.
Y. Solr 6.0.0 is shipping with Tika 1.7.  Grobid came in with Tika 1.11.

-Original Message-
From: Allison, Timothy B. [mailto:talli...@mitre.org] 
Sent: Wednesday, May 4, 2016 10:29 AM
To: solr-user@lucene.apache.org
Subject: RE: Integrating grobid with Tika in solr

I think Solr is using a version of Tika that predates that addition of the 
Grobid parser.  You'll have to add that manually somehow until Solr upgrades to 
Tika 1.13 (soon to be released...I think).  SOLR-8981.

-Original Message-
From: Betsey Benagh [mailto:betsey.ben...@stresearch.com] 
Sent: Wednesday, May 4, 2016 10:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Integrating grobid with Tika in solr

Grobid runs as a service, and I'm (theoretically) configuring Tika to call it.

From the Grobid wiki, here are instructions for integrating with the Tika
application:

First we need to create the GrobidExtractor.properties file that points to the 
Grobid REST Service. My file looks like the following:

grobid.server.url=http://localhost:[port]

Now you can run GROBID via Tika-app with the following command on a sample PDF 
file.

java -classpath $HOME/src/grobidparser-resources/:tika-app-1.11-SNAPSHOT.jar 
org.apache.tika.cli.TikaCLI 
--config=$HOME/src/grobidparser-resources/tika-config.xml -J 
$HOME/src/grobid/papers/ICSE06.pdf

Here's the stack trace.

<str name="error-class">org.apache.solr.common.SolrException</str>
<str name="root-error-class">java.lang.ClassNotFoundException</str>
<str name="msg">org.apache.tika.exception.TikaException: Unable to find a parser
class: org.apache.tika.parser.journal.JournalParser</str>
<str name="trace">org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: Unable to find a parser class:
org.apache.tika.parser.journal.JournalParser
at 
org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(ExtractingRequestHandler.java:82)
at 
org.apache.solr.core.PluginBag$LazyPluginHolder.createInst(PluginBag.java:367)
at org.apache.solr.core.PluginBag$LazyPluginHolder.get(PluginBag.java:348)
at org.apache.solr.core.PluginBag.get(PluginBag.java:148)
at 
org.apache.solr.handler.RequestHandlerBase.getRequestHandler(RequestHandlerBase.java:231)
at org.apache.solr.core.SolrCore.getRequestHandler(SolrCore.java:1362)
at 
org.apache.solr.servlet.HttpSolrCall.extractHandlerFromURLPath(HttpSolrCall.java:326)
at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:296)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:412)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:225)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:183)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tika.exception.TikaException: Unable to find a parser 
class: org.apache.tika.parser.journal.JournalParser
at org.apache.tika.config.TikaConfig.parserFromDomElement(TikaConfig.java:362)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:127)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:115)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:111)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:92)
at 
org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(ExtractingRequestHandler.java:80)
... 30 more
Caused by: java.lang.ClassNotFoundException: 
org.apache.tika.parser.journal.JournalParser
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at 

RE: Integrating grobid with Tika in solr

2016-05-04 Thread Allison, Timothy B.
I think Solr is using a version of Tika that predates that addition of the 
Grobid parser.  You'll have to add that manually somehow until Solr upgrades to 
Tika 1.13 (soon to be released...I think).  SOLR-8981.

-Original Message-
From: Betsey Benagh [mailto:betsey.ben...@stresearch.com] 
Sent: Wednesday, May 4, 2016 10:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Integrating grobid with Tika in solr

Grobid runs as a service, and I'm (theoretically) configuring Tika to call it.

From the Grobid wiki, here are instructions for integrating with the Tika
application:

First we need to create the GrobidExtractor.properties file that points to the 
Grobid REST Service. My file looks like the following:

grobid.server.url=http://localhost:[port]

Now you can run GROBID via Tika-app with the following command on a sample PDF 
file.

java -classpath $HOME/src/grobidparser-resources/:tika-app-1.11-SNAPSHOT.jar 
org.apache.tika.cli.TikaCLI 
--config=$HOME/src/grobidparser-resources/tika-config.xml -J 
$HOME/src/grobid/papers/ICSE06.pdf

Here's the stack trace.

<str name="error-class">org.apache.solr.common.SolrException</str>
<str name="root-error-class">java.lang.ClassNotFoundException</str>
<str name="msg">org.apache.tika.exception.TikaException: Unable to find a parser
class: org.apache.tika.parser.journal.JournalParser</str>
<str name="trace">org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: Unable to find a parser class:
org.apache.tika.parser.journal.JournalParser
at 
org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(ExtractingRequestHandler.java:82)
at 
org.apache.solr.core.PluginBag$LazyPluginHolder.createInst(PluginBag.java:367)
at org.apache.solr.core.PluginBag$LazyPluginHolder.get(PluginBag.java:348)
at org.apache.solr.core.PluginBag.get(PluginBag.java:148)
at 
org.apache.solr.handler.RequestHandlerBase.getRequestHandler(RequestHandlerBase.java:231)
at org.apache.solr.core.SolrCore.getRequestHandler(SolrCore.java:1362)
at 
org.apache.solr.servlet.HttpSolrCall.extractHandlerFromURLPath(HttpSolrCall.java:326)
at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:296)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:412)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:225)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:183)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tika.exception.TikaException: Unable to find a parser 
class: org.apache.tika.parser.journal.JournalParser
at org.apache.tika.config.TikaConfig.parserFromDomElement(TikaConfig.java:362)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:127)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:115)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:111)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:92)
at 
org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(ExtractingRequestHandler.java:80)
... 30 more
Caused by: java.lang.ClassNotFoundException: 
org.apache.tika.parser.journal.JournalParser
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method) at 
java.lang.Class.forName(Class.java:348)
at org.apache.tika.config.ServiceLoader.getServiceClass(ServiceLoader.java:189)
at 

Re: OOM script executed

2016-05-04 Thread Shawn Heisey
On 5/3/2016 11:58 PM, Bastien Latard - MDPI AG wrote:
> Thank you for your email.
> You said "have big caches or request big pages (e.g. 100k docs)"...
> Does a fq cache all the potential results, or only the ones the query
> returns?
> e.g.: select?q=*:*&fq=bPublic:true&rows=10
>
> => with this query, if I have 60 million public documents, would
> it cache 10 IDs or 60 million?
> ...and does it cache it the filter cache (from fq) in the OS cache or
> in java heap? 

The result of a filter query is a bitset.  If the core contains 60
million documents, each bitset is 7.5 million bytes in length.  It is
not a list of IDs -- it's a large array of bits representing every
document in the Lucene index, including deleted documents (the Max Doc
value from the core overview).  There are two values for each bit - 0 or
1, depending on whether each document matches the filter or not.
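
In code, the arithmetic looks like this (a sketch; the 60 million maxDoc and
the 512-entry cache size are only illustrative):

public class FilterCacheMath {
    public static void main(String[] args) {
        long maxDoc = 60_000_000L;         // every doc in the core, incl. deleted
        long bytesPerFilter = maxDoc / 8;  // one bit per doc -> 7,500,000 bytes
        long cacheEntries = 512;           // hypothetical filterCache size
        // worst-case heap held by the filterCache alone: ~3.8 GB
        System.out.println(bytesPerFilter * cacheEntries);
    }
}

These bitsets are held in the filterCache on the Java heap; the OS page cache
only holds the index files themselves.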

Thanks,
Shawn



Re: solr.ICUCollationField class on cloudera search solr

2016-05-04 Thread Shawn Heisey
On 5/4/2016 3:53 AM, tkg_cangkul wrote:
> i have checked the library
>>
>> /opt/cloudera/parcels/CDH/lib/solr/solr-analysis-extras-4.10.3-cdh5.7.0.jar
>>and there is solr.ICUCollationField class. but why i still have this
>>error message?
>> pls help


You may be able to get this question answered much faster and easier by
asking Cloudera for help.  They know how they have customized Solr ...
we don't.  They will probably know exactly what to change ... for us to
help, we will need to ask for a lot of information that Cloudera will
probably already know without asking.


The way I would suggest handling extra jars is to remove all <lib>
elements from solrconfig.xml and put those jars in ${solr.solr.home}/lib
... but this may not be compatible with the way that Cloudera does things.


Thanks,
Shawn



Re: Integrating grobid with Tika in solr

2016-05-04 Thread Betsey Benagh
Grobid runs as a service, and I’m (theoretically) configuring Tika to call it.

From the Grobid wiki, here are instructions for integrating with the Tika
application:

First we need to create the GrobidExtractor.properties file that points to the 
Grobid REST Service. My file looks like the following:

grobid.server.url=http://localhost:[port]

Now you can run GROBID via Tika-app with the following command on a sample PDF 
file.

java -classpath $HOME/src/grobidparser-resources/:tika-app-1.11-SNAPSHOT.jar 
org.apache.tika.cli.TikaCLI 
--config=$HOME/src/grobidparser-resources/tika-config.xml -J 
$HOME/src/grobid/papers/ICSE06.pdf

Here’s the stack trace.

<str name="error-class">org.apache.solr.common.SolrException</str>
<str name="root-error-class">java.lang.ClassNotFoundException</str>
<str name="msg">org.apache.tika.exception.TikaException: Unable to find a parser
class: org.apache.tika.parser.journal.JournalParser</str>
<str name="trace">org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: Unable to find a parser class:
org.apache.tika.parser.journal.JournalParser
at 
org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(ExtractingRequestHandler.java:82)
at 
org.apache.solr.core.PluginBag$LazyPluginHolder.createInst(PluginBag.java:367)
at org.apache.solr.core.PluginBag$LazyPluginHolder.get(PluginBag.java:348)
at org.apache.solr.core.PluginBag.get(PluginBag.java:148)
at 
org.apache.solr.handler.RequestHandlerBase.getRequestHandler(RequestHandlerBase.java:231)
at org.apache.solr.core.SolrCore.getRequestHandler(SolrCore.java:1362)
at 
org.apache.solr.servlet.HttpSolrCall.extractHandlerFromURLPath(HttpSolrCall.java:326)
at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:296)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:412)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:225)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:183)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tika.exception.TikaException: Unable to find a parser 
class: org.apache.tika.parser.journal.JournalParser
at org.apache.tika.config.TikaConfig.parserFromDomElement(TikaConfig.java:362)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:127)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:115)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:111)
at org.apache.tika.config.TikaConfig.init(TikaConfig.java:92)
at 
org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(ExtractingRequestHandler.java:80)
... 30 more
Caused by: java.lang.ClassNotFoundException: 
org.apache.tika.parser.journal.JournalParser
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.tika.config.ServiceLoader.getServiceClass(ServiceLoader.java:189)
at org.apache.tika.config.TikaConfig.parserFromDomElement(TikaConfig.java:338)
... 35 more
<int name="code">500</int>



On 5/4/16, 10:00 AM, "Shawn Heisey" 
> wrote:

On 5/4/2016 7:15 AM, Betsey Benagh wrote:
(X-posted from stack overflow)
This feels like a basic, dumb question, but my reading of the documentation has 
not led me to an answer.
i'm using Solr to index journal articles. Using the 

Re: Integrating grobid with Tika in solr

2016-05-04 Thread Shawn Heisey
On 5/4/2016 7:15 AM, Betsey Benagh wrote:
> (X-posted from stack overflow)
> 
> This feels like a basic, dumb question, but my reading of the documentation 
> has not led me to an answer.
> 
> 
> i'm using Solr to index journal articles. Using the out-of-the-box 
> configuration, it indexed the text of the documents, but I'm looking to use 
> Grobid to pull out the authors, title, affiliations, etc. I got grobid up and 
> running as a service.
> 
> I added
> 
> <str name="tika.config">/path/to/tika-config.xml</str>
> 
> to the requestHandler for /update/extract in solrconfig.xml
> 
> The tika-config looks like:
> 
> <properties>
>   <parsers>
>     <parser class="org.apache.tika.parser.journal.JournalParser">
>       <mime>application/pdf</mime>
>     </parser>
>   </parsers>
> </properties>
> 
> I'm getting a ClassNotFound exception when I try to import a document, but 
> can't figure out where to set the classpath to fix it.

I do not know anything about grobid.

We'll need to see the exception -- the entire multi-line stacktrace,
including any "caused by" sections.

In general, you should create a lib directory in the solr home and place
all extra jars in that directory.  Otherwise you need <lib> elements in
solrconfig.xml to load jars -- and they will be loaded once for every
core that uses that <lib> element.  ${solr.solr.home}/lib loads jars
*once* when Solr starts and makes them available to all cores.
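
For example, instead of a per-core element in solrconfig.xml like this (path
hypothetical):

<lib dir="${solr.install.dir:../../..}/contrib/extraction/lib" regex=".*\.jar" />

you would create the lib directory under ${solr.solr.home} (for example
/var/solr/data/lib on a typical service install), copy the jars there once,
and restart Solr.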

Thanks,
Shawn



Nodes appear twice in state.json

2016-05-04 Thread Markus Jelsma
Hi - we've just upgraded a development environment from 5.5 to Solr 6.0. After
the upgrade, which went fine, we see the same replica appear twice in the cloud
view (see below), with both entries marked as leader. We've seen this happen
before on some older 5.x versions. Is there a Jira issue i am missing? An
unknown issue?

Also, how to fix this. How do we remove the double node from the state.json?

Many thanks!
Markus

{"search":{
    "replicationFactor":"3",
    "shards":{
      "shard1":{
        "range":"80000000-ffffffff",
        "state":"active",
        "replicas":{
          "core_node6":{
            "core":"search_shard1_replica1",
            "base_url":"http://idx5.oi.dev:8983/solr",
            "node_name":"idx5.oi.dev:8983_solr",
            "state":"down"},
          "core_node2":{
            "core":"search_shard1_replica2",
            "base_url":"http://idx2.oi.dev:8983/solr",
            "node_name":"idx2.oi.dev:8983_solr",
            "state":"active",
            "leader":"true"},
          "core_node3":{
            "core":"search_shard1_replica2",
            "base_url":"http://idx2.oi.dev:8983/solr",
            "node_name":"idx2.oi.dev:8983_solr",
            "state":"down",
            "leader":"true"},
          "core_node5":{
            "core":"search_shard1_replica3",
            "base_url":"http://idx3.oi.dev:8983/solr",
            "node_name":"idx3.oi.dev:8983_solr",
            "state":"down"}}},
      "shard2":{
        "range":"0-7fffffff",
        "state":"active",
        "replicas":{
          "core_node1":{
            "core":"search_shard2_replica1",
            "base_url":"http://idx4.oi.dev:8983/solr",
            "node_name":"idx4.oi.dev:8983_solr",
            "state":"down"},
          "core_node2":{
            "core":"search_shard2_replica2",
            "base_url":"http://idx6.oi.dev:8983/solr",
            "node_name":"idx6.oi.dev:8983_solr",
            "state":"down"},
          "core_node4":{
            "core":"search_shard2_replica3",
            "base_url":"http://idx1.oi.dev:8983/solr",
            "node_name":"idx1.oi.dev:8983_solr",
            "state":"active",
            "leader":"true"}}}},
    "router":{"name":"compositeId"},
    "maxShardsPerNode":"1",
    "autoAddReplicas":"false"}}

 



Integrating grobid with Tika in solr

2016-05-04 Thread Betsey Benagh
(X-posted from stack overflow)

This feels like a basic, dumb question, but my reading of the documentation has 
not led me to an answer.


i'm using Solr to index journal articles. Using the out-of-the-box 
configuration, it indexed the text of the documents, but I'm looking to use 
Grobid to pull out the authors, title, affiliations, etc. I got grobid up and 
running as a service.

I added

<str name="tika.config">/path/to/tika-config.xml</str>

to the requestHandler for /update/extract in solrconfig.xml

The tika-config looks like:

<properties>
  <parsers>
    <parser class="org.apache.tika.parser.journal.JournalParser">
      <mime>application/pdf</mime>
    </parser>
  </parsers>
</properties>

I'm getting a ClassNotFound exception when I try to import a document, but 
can't figure out where to set the classpath to fix it.


Re: Solr 6 / Solrj RuntimeException: First tuple is not a metadata tuple

2016-05-04 Thread Kevin Risden
>
> java.sql.SQLException: java.lang.RuntimeException: First tuple is not a
> metadata tuple
>

That is a client-side error message meaning that the statement couldn't be
handled. There should be better error handling around this, but it's not in
place currently.

And on Solr side, the logs seem okay:


The logs you shared don't seem to be the full logs. There will be a related
exception on the Solr server side. The exception on the Solr server side
will explain the cause of the problem.
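
(As a pointer: on a default install the server-side log is
server/logs/solr.log, so something like

grep -B2 -A30 Exception server/logs/solr.log

around the timestamp of the failing /sql request should surface the
underlying exception.)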

Kevin Risden

On Wed, May 4, 2016 at 2:57 AM, deniz  wrote:

> I am trying to go through the steps  here
> 
> to start playing with the new api, but I am getting:
>
> java.sql.SQLException: java.lang.RuntimeException: First tuple is not a
> metadata tuple
> at
>
> org.apache.solr.client.solrj.io.sql.StatementImpl.executeQuery(StatementImpl.java:70)
> at com.sematext.blog.App.main(App.java:28)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> Caused by: java.lang.RuntimeException: First tuple is not a metadata tuple
> at
>
> org.apache.solr.client.solrj.io.sql.ResultSetImpl.<init>(ResultSetImpl.java:75)
> at
>
> org.apache.solr.client.solrj.io.sql.StatementImpl.executeQuery(StatementImpl.java:67)
> ... 6 more
>
>
>
> My code is
>
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.ResultSet;
> import java.sql.SQLException;
> import java.sql.Statement;
>
>
> /**
>  * Hello world!
>  *
>  */
> public class App
> {
> public static void main( String[] args )
> {
>
>
> Connection connection = null;
> Statement statement = null;
> ResultSet resultSet = null;
>
> try{
> String connectionString =
>
> "jdbc:solr://zkhost:port?collection=test&aggregationMode=map_reduce&numWorkers=1";
> connection = DriverManager.getConnection(connectionString);
> statement  = connection.createStatement();
> resultSet = statement.executeQuery("select id, text from test
> where tits=1 limit 5");
> while(resultSet.next()){
> String id = resultSet.getString("id");
> String nickname = resultSet.getString("text");
>
> System.out.println(id + " : " + nickname);
> }
> }catch(Exception e){
> e.printStackTrace();
> }finally{
> if (resultSet != null) {
> try {
> resultSet.close();
> } catch (Exception ex) {
> }
> }
> if (statement != null) {
> try {
> statement.close();
> } catch (Exception ex) {
> }
> }
> if (connection != null) {
> try {
> connection.close();
> } catch (Exception ex) {
> }
> }
> }
>
>
> }
> }
>
>
> I tried to figure out what is happening, but there is no more logs other
> than the one above. And on Solr side, the logs seem okay:
>
> 2016-05-04 15:52:30.364 INFO  (qtp1634198-41) [c:test s:shard1 r:core_node1
> x:test] o.a.s.c.S.Request [test]  webapp=/solr path=/sql
>
> params={includeMetadata=true&numWorkers=1&wt=json&version=2.2&stmt=select+id,+text+from+test+where+tits%3D1+limit+5&aggregationMode=map_reduce}
> status=0 QTime=3
> 2016-05-04 15:52:30.382 INFO  (qtp1634198-46) [c:test s:shard1 r:core_node1
> x:test] o.a.s.c.S.Request [test]  webapp=/solr path=/select
>
> params={q=(tits:"1")&distrib=false&fl=id,text,score&sort=score+desc&rows=5&wt=json&version=2.2}
> hits=5624 status=0 QTime=1
>
>
> The error is happening because of some missing handlers on errors on the
> code or because of some strict checks on IDE(Ideaj)? Anyone had similar
> issues while using sql with solrj?
>
>
> Thanks
>
> Deniz
>
>
>
> -
> Zeki ama calismiyor... Calissa yapar...
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-6-Solrj-RuntimeException-First-tuple-is-not-a-metadata-tuple-tp4274451.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


RE: Migrating from Solr 5.4 to Solr 6.0

2016-05-04 Thread Markus Jelsma
No, you don't need to reindex. The similarity is applied at query time, so the
switch to BM25 does not change the on-disk index format.
M. 
 
-Original message-
> From:Zheng Lin Edwin Yeo 
> Sent: Wednesday 4th May 2016 13:27
> To: solr-user@lucene.apache.org
> Subject: Migrating from Solr 5.4 to Solr 6.0
> 
> Hi,
> 
> Would like to find out, do we need to re-index our document when we migrate
> from Solr 5.4 to Solr 6.0 because of the change in scoring algorithm to
> BM25?
> 
> Regards,
> Edwin
> 


MoreLikeThis (MLT) search

2016-05-04 Thread Zheng Lin Edwin Yeo
Hi,

Would like to find out, must the field be indexed as a string type before
we can perform a MoreLikeThis (MLT) query?

Currently, my indexes are indexed with the HMMChineseTokenizer, so will it
work well for MLT query?

Below is my configuration for the fieldType which I'm planning to execute
the MLT query on.

[fieldType definition not preserved in the archive; the analyzer chain uses
HMMChineseTokenizer]

Regards,
Edwin


Migrating from Solr 5.4 to Solr 6.0

2016-05-04 Thread Zheng Lin Edwin Yeo
Hi,

Would like to find out, do we need to re-index our document when we migrate
from Solr 5.4 to Solr 6.0 because of the change in scoring algorithm to
BM25?

Regards,
Edwin


Re: SOLR edismax and mm request parameter

2016-05-04 Thread Mark Robinson
Thanks for the mail Jacques.
I have a doubt here.

When we use q.op=AND, what I understood is that ALL query terms should be
present anywhere across the various "qf" fields, ie all of the query terms
need not be present in one single field, but just need to be present
somewhere among the various qf fields.

My requirement is I need *ALL query terms* to be present in at least *any
one of the qf fields* for a doc to qualify.
So not sure whether q.op=AND will help me.
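
If I go with the nested-query suggestion from Ahmet (quoted further down), I
assume the expanded request would look something like this (field names are
just placeholders):

q=_query_:"{!dismax qf='field1' mm='100%' v=$qq}" OR
  _query_:"{!dismax qf='field2' mm='100%' v=$qq}"&qq=blue stainless washer

ie each dismax clause requires all of the qq terms within a single field, and
the OR lets any one field satisfy that.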

Pls correct me if I am missing something.

Thanks!
Mark.



On Wed, May 4, 2016 at 5:57 AM, Jacques du Rand 
wrote:

> Sorry I meant "Ahmet Arslan" answer :)
>
>
> On 4 May 2016 at 11:56, Jacques du Rand  wrote:
>
> > Although Mark Robinson's answer is correct you are now using the DISMAX
> > not the Edismax parser...
> > You can also play around with changing  q.op parameter  to 'AND'
> >
> >
> >
> > On 4 May 2016 at 11:40, Mark Robinson  wrote:
> >
> >> Thanks much Ahmet!
> >>
> >> I will try that out.
> >>
> >> Best,
> >> Mark
> >>
> >> On Tue, May 3, 2016 at 11:53 PM, Ahmet Arslan  >
> >> wrote:
> >>
> >> > Hi Mark,
> >> >
> >> > You could do something like this:
> >> >
> >> > _query_:{!dismax qf='field1' mm='100%' v=$qq}
> >> > OR
> >> > _query_:{!dismax qf='field2' mm='100%' v=$qq}
> >> > OR
> >> > _query_:{!dismax qf='field3' mm='100%' v=$qq}
> >> >
> >> >
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
> >> >
> >> > Ahmet
> >> >
> >> >
> >> >
> >> > On Wednesday, May 4, 2016 4:59 AM, Mark Robinson <
> >> mark123lea...@gmail.com>
> >> > wrote:
> >> > Hi,
> >> > On further checking cld identify that *blue *is indeed appearing in
> one
> >> of
> >> > the qf fields.My bad!
> >> >
> >> > Cld someone pls help me with the 2nd question.
> >> >
> >> > Thanks!
> >> > Mark.
> >> >
> >> >
> >> >
> >> > On Tue, May 3, 2016 at 8:03 PM, Mark Robinson <
> mark123lea...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > I made a typo err in the prev mail for my first question when I
> listed
> >> > the
> >> > > query terms.
> >> > > Let me re-type both questions here once again pls.
> >> > > Sorry for any inconvenience.
> >> > >
> >> > > 1.
> >> > > My understanding of the mm parameter related to edismax is that,
> >> > > if mm=100%,  only if ALL my query terms appear across any of the qf
> >> > fields
> >> > > will I get back
> >> > > documents ... ie all the terms *need not be present in one single
> >> field*
> >> > ..
> >> > > they just need to be present across any of the fields in my qf list.
> >> > >
> >> > > But my query for  the terms:-
> >> > > *blue stainless washer*
> >> > > ... returns a document which has *Stainless Washer *in one of my qf
> >> > > fields, but *blue *is not there in any of the qf fields. Then how
> did
> >> it
> >> > > get returned even though I had given mm=100% (100%25 when I typed
> >> > directly
> >> > > in browser). Any suggestions please.. In fact this is my first
> record!
> >> > >
> >> > > 2.
> >> > > Another question I have is:-
> >> > > With edismax can I enforce that all my query terms should appear in
> >> ANY
> >> > of
> >> > > my qf fields to qualify as a result document? I know all terms
> >> appearing
> >> > in
> >> > > a single field can give a boost if we use the "pf" query parameter
> >> > > accordingly. But how can I insist that to qualify as a result, the
> doc
> >> > > should have ALL of my query term in one or more of the qf fields?
> >> > >
> >> > >
> >> > > Cld some one pls help.
> >> > >
> >> > > Thanks!
> >> > >
> >> > > Mark
> >> > >
> >> > > On Tue, May 3, 2016 at 6:28 PM, Mark Robinson <
> >> mark123lea...@gmail.com>
> >> > > wrote:
> >> > >
> >> > >> Hi,
> >> > >>
> >> > >> 1.
> >> > >> My understanding of the mm parameter related to edismax is that,
> >> > >> if mm=100%,  only if ALL my query terms appear across any of the qf
> >> > >> fields will I get back
> >> > >> documents ... ie all the terms *need not be present in one single
> >> field*
> >> > >> .. they just need to be present across any of the fields in my qf
> >> list.
> >> > >>
> >> > >> But my query for  the terms:-
> >> > >> *blue stainless washer*
> >> > >> ... returns a document which has *Stainless Washer *in one of my qf
> >> > >> fields, but *refrigerator *is not there in any of the qf fields.
> Then
> >> > >> how did it get returned even though I had given mm=100% (100%25
> when
> >> I
> >> > >> typed directly in browser). Any suggestions please.
> >> > >>
> >> > >> 2.
> >> > >> Another question I have is:-
> >> > >> With edismax can I enforce that all my query terms should appear in
> >> ANY
> >> > >> of my qf fields to qualify as a result document? I know all terms
> >> > appearing
> >> > >> in a single field can give a boost if we use the "pf" query
> parameter
> >> > >> accordingly. But how can I insist that to qualify as a result, the
> >> doc
> >> > >> 

Re: Results of facet differs with change in facet.limit.

2016-05-04 Thread Modassar Ather
The "val1" is the same for both the limit=100 and limit=200 tests, so the
following is the case:

limit=100
<int name="val1">1225</int>
<int name="val2">1082</int>
<int name="val3">1076</int>

limit=200
<int name="val1">1366</int>
<int name="val2">1321</int>
<int name="val3">1315</int>

This I have noticed irrespective of facet.limit too. Please refer to my
previous mail for the example.

Thanks,
Modassar

On Wed, May 4, 2016 at 3:01 PM, Toke Eskildsen 
wrote:

> On Mon, 2016-05-02 at 15:53 +0530, Modassar Ather wrote:
> > E.g.
> > Query : q=text_field:term&facet.field=f&facet.limit=100
> > Result :
> > <int name="val1">1225</int>
> > <int name="val2">1082</int>
> > <int name="val3">1076</int>
> >
> > Query : q=text_field:term&facet.field=f&facet.limit=200
> > <int name="val1">1366</int>
> > <int name="val2">1321</int>
> > <int name="val3">1315</int>
>
> Is the "val1" in your limit=100 test the same term as your "val1" in
> your limit=200-test?
>
>
> Or to phrase it another way: Do you have
>
> limit=100
> <int name="val1">1225</int>
> <int name="val2">1082</int>
> <int name="val3">1076</int>
>
> limit=200
> <int name="val1">1366</int>
> <int name="val2">1321</int>
> <int name="val3">1315</int>
>
>
> or
>
> limit=100
> <int name="val1">1225</int>
> <int name="val2">1082</int>
> <int name="val3">1076</int>
>
> limit=200
> <int name="val4">1366</int>
> <int name="val5">1321</int>
> <int name="val6">1315</int>
>
>
> - Toke Eskildsen, State and University Library, Denmark
>
>


Re: SOLR edismax and mm request parameter

2016-05-04 Thread Jacques du Rand
Sorry, I meant "Ahmet Arslan's" answer :)


On 4 May 2016 at 11:56, Jacques du Rand  wrote:

> Although Mark Robinson's answer is correct you are now using the DISMAX
> not the Edismax parser...
> You can also play around with changing  q.op parameter  to 'AND'
>
>
>
> On 4 May 2016 at 11:40, Mark Robinson  wrote:
>
>> Thanks much Ahmet!
>>
>> I will try that out.
>>
>> Best,
>> Mark
>>
>> On Tue, May 3, 2016 at 11:53 PM, Ahmet Arslan 
>> wrote:
>>
>> > Hi Mark,
>> >
>> > You could do something like this:
>> >
>> > _query_:{!dismax qf='field1' mm='100%' v=$qq}
>> > OR
>> > _query_:{!dismax qf='field2' mm='100%' v=$qq}
>> > OR
>> > _query_:{!dismax qf='field3' mm='100%' v=$qq}
>> >
>> >
>> >
>> >
>> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
>> >
>> > Ahmet
>> >
>> >
>> >
>> > On Wednesday, May 4, 2016 4:59 AM, Mark Robinson <
>> mark123lea...@gmail.com>
>> > wrote:
>> > Hi,
>> > On further checking cld identify that *blue *is indeed appearing in one
>> of
>> > the qf fields.My bad!
>> >
>> > Cld someone pls help me with the 2nd question.
>> >
>> > Thanks!
>> > Mark.
>> >
>> >
>> >
>> > On Tue, May 3, 2016 at 8:03 PM, Mark Robinson 
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > I made a typo err in the prev mail for my first question when I listed
>> > the
>> > > query terms.
>> > > Let me re-type both questions here once again pls.
>> > > Sorry for any inconvenience.
>> > >
>> > > 1.
>> > > My understanding of the mm parameter related to edismax is that,
>> > > if mm=100%,  only if ALL my query terms appear across any of the qf
>> > fields
>> > > will I get back
>> > > documents ... ie all the terms *need not be present in one single
>> field*
>> > ..
>> > > they just need to be present across any of the fields in my qf list.
>> > >
>> > > But my query for  the terms:-
>> > > *blue stainless washer*
>> > > ... returns a document which has *Stainless Washer *in one of my qf
>> > > fields, but *blue *is not there in any of the qf fields. Then how did
>> it
>> > > get returned even though I had given mm=100% (100%25 when I typed
>> > directly
>> > > in browser). Any suggestions please.. In fact this is my first record!
>> > >
>> > > 2.
>> > > Another question I have is:-
>> > > With edismax can I enforce that all my query terms should appear in
>> ANY
>> > of
>> > > my qf fields to qualify as a result document? I know all terms
>> appearing
>> > in
>> > > a single field can give a boost if we use the "pf" query parameter
>> > > accordingly. But how can I insist that to qualify as a result, the doc
>> > > should have ALL of my query term in one or more of the qf fields?
>> > >
>> > >
>> > > Cld some one pls help.
>> > >
>> > > Thanks!
>> > >
>> > > Mark
>> > >
>> > > On Tue, May 3, 2016 at 6:28 PM, Mark Robinson <
>> mark123lea...@gmail.com>
>> > > wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> 1.
>> > >> My understanding of the mm parameter related to edismax is that,
>> > >> if mm=100%,  only if ALL my query terms appear across any of the qf
>> > >> fields will I get back
>> > >> documents ... ie all the terms *need not be present in one single
>> field*
>> > >> .. they just need to be present across any of the fields in my qf
>> list.
>> > >>
>> > >> But my query for  the terms:-
>> > >> *blue stainless washer*
>> > >> ... returns a document which has *Stainless Washer *in one of my qf
>> > >> fields, but *refrigerator *is not there in any of the qf fields. Then
>> > >> how did it get returned even though I had given mm=100% (100%25 when
>> I
>> > >> typed directly in browser). Any suggestions please.
>> > >>
>> > >> 2.
>> > >> Another question I have is:-
>> > >> With edismax can I enforce that all my query terms should appear in
>> ANY
>> > >> of my qf fields to qualify as a result document? I know all terms
>> > appearing
>> > >> in a single field can give a boost if we use the "pf" query parameter
>> > >> accordingly. But how can I insist that to qualify as a result, the
>> doc
>> > >> should have ALL of my query term in one or more of the qf fields?
>> > >>
>> > >>
>> > >> Cld some one pls help.
>> > >>
>> > >> Thanks!
>> > >> Mark.
>> > >>
>> > >
>> > >
>> >
>>
>
>
>
> --
> Jacques du Rand
> Senior R&D Programmer
>
> T: +27214688017
> F: +27862160617
> E: jacq...@pricecheck.co.za
> 
>



-- 
Jacques du Rand
Senior R&D Programmer

T: +27214688017
F: +27862160617
E: jacq...@pricecheck.co.za



Re: SOLR edismax and mm request parameter

2016-05-04 Thread Jacques du Rand
Although Mark Robinson's answer is correct, you are now using the DISMAX, not
the Edismax, parser...
You can also play around with changing the q.op parameter to 'AND'.



On 4 May 2016 at 11:40, Mark Robinson  wrote:

> Thanks much Ahmet!
>
> I will try that out.
>
> Best,
> Mark
>
> On Tue, May 3, 2016 at 11:53 PM, Ahmet Arslan 
> wrote:
>
> > Hi Mark,
> >
> > You could do something like this:
> >
> > _query_:{!dismax qf='field1' mm='100%' v=$qq}
> > OR
> > _query_:{!dismax qf='field2' mm='100%' v=$qq}
> > OR
> > _query_:{!dismax qf='field3' mm='100%' v=$qq}
> >
> >
> >
> >
> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
> >
> > Ahmet
> >
> >
> >
> > On Wednesday, May 4, 2016 4:59 AM, Mark Robinson <
> mark123lea...@gmail.com>
> > wrote:
> > Hi,
> > On further checking cld identify that *blue *is indeed appearing in one
> of
> > the qf fields.My bad!
> >
> > Cld someone pls help me with the 2nd question.
> >
> > Thanks!
> > Mark.
> >
> >
> >
> > On Tue, May 3, 2016 at 8:03 PM, Mark Robinson 
> > wrote:
> >
> > > Hi,
> > >
> > > I made a typo err in the prev mail for my first question when I listed
> > the
> > > query terms.
> > > Let me re-type both questions here once again pls.
> > > Sorry for any inconvenience.
> > >
> > > 1.
> > > My understanding of the mm parameter related to edismax is that,
> > > if mm=100%,  only if ALL my query terms appear across any of the qf
> > fields
> > > will I get back
> > > documents ... ie all the terms *need not be present in one single
> field*
> > ..
> > > they just need to be present across any of the fields in my qf list.
> > >
> > > But my query for  the terms:-
> > > *blue stainless washer*
> > > ... returns a document which has *Stainless Washer *in one of my qf
> > > fields, but *blue *is not there in any of the qf fields. Then how did
> it
> > > get returned even though I had given mm=100% (100%25 when I typed
> > directly
> > > in browser). Any suggestions please.. In fact this is my first record!
> > >
> > > 2.
> > > Another question I have is:-
> > > With edismax can I enforce that all my query terms should appear in ANY
> > of
> > > my qf fields to qualify as a result document? I know all terms
> appearing
> > in
> > > a single field can give a boost if we use the "pf" query parameter
> > > accordingly. But how can I insist that to qualify as a result, the doc
> > > should have ALL of my query term in one or more of the qf fields?
> > >
> > >
> > > Cld some one pls help.
> > >
> > > Thanks!
> > >
> > > Mark
> > >
> > > On Tue, May 3, 2016 at 6:28 PM, Mark Robinson  >
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> 1.
> > >> My understanding of the mm parameter related to edismax is that,
> > >> if mm=100%,  only if ALL my query terms appear across any of the qf
> > >> fields will I get back
> > >> documents ... ie all the terms *need not be present in one single
> field*
> > >> .. they just need to be present across any of the fields in my qf
> list.
> > >>
> > >> But my query for  the terms:-
> > >> *blue stainless washer*
> > >> ... returns a document which has *Stainless Washer *in one of my qf
> > >> fields, but *refrigerator *is not there in any of the qf fields. Then
> > >> how did it get returned even though I had given mm=100% (100%25 when I
> > >> typed directly in browser). Any suggestions please.
> > >>
> > >> 2.
> > >> Another question I have is:-
> > >> With edismax can I enforce that all my query terms should appear in
> ANY
> > >> of my qf fields to qualify as a result document? I know all terms
> > appearing
> > >> in a single field can give a boost if we use the "pf" query parameter
> > >> accordingly. But how can I insist that to qualify as a result, the doc
> > >> should have ALL of my query term in one or more of the qf fields?
> > >>
> > >>
> > >> Cld some one pls help.
> > >>
> > >> Thanks!
> > >> Mark.
> > >>
> > >
> > >
> >
>



-- 
Jacques du Rand
Senior R&D Programmer

T: +27214688017
F: +27862160617
E: jacq...@pricecheck.co.za



Re: solr.ICUCollationField class on cloudera search solr

2016-05-04 Thread tkg_cangkul

hi Ahmet thx for your reply.

i've tried using the fully qualified class name as you suggested but it still
failed.


[error screenshot]
On 04/05/16 16:34, Ahmet Arslan wrote:

Hi,

Sometimes using the fully qualified class name works: using
org.apache.x.y.z.ICUCollationField instead of solr.ICUCollationField.
Ahmet

On Wednesday, May 4, 2016 11:13 AM, tkg_cangkul  wrote:



hi i'm using solr in cloudera.
when i try to create a core i've got this error message :

[error screenshot]

i have checked the library
/opt/cloudera/parcels/CDH/lib/solr/solr-analysis-extras-4.10.3-cdh5.7.0.jar
and there is a solr.ICUCollationField class. but why do i still have this
error message?
pls help




Re: SOLR edismax and mm request parameter

2016-05-04 Thread Mark Robinson
Thanks much Ahmet!

I will try that out.

Best,
Mark

On Tue, May 3, 2016 at 11:53 PM, Ahmet Arslan 
wrote:

> Hi Mark,
>
> You could do something like this:
>
> _query_:{!dismax qf='field1' mm='100%' v=$qq}
> OR
> _query_:{!dismax qf='field2' mm='100%' v=$qq}
> OR
> _query_:{!dismax qf='field3' mm='100%' v=$qq}
>
>
>
> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
>
> Ahmet
>
>
>
> On Wednesday, May 4, 2016 4:59 AM, Mark Robinson 
> wrote:
> Hi,
> On further checking cld identify that *blue *is indeed appearing in one of
> the qf fields.My bad!
>
> Cld someone pls help me with the 2nd question.
>
> Thanks!
> Mark.
>
>
>
> On Tue, May 3, 2016 at 8:03 PM, Mark Robinson 
> wrote:
>
> > Hi,
> >
> > I made a typo err in the prev mail for my first question when I listed
> the
> > query terms.
> > Let me re-type both questions here once again pls.
> > Sorry for any inconvenience.
> >
> > 1.
> > My understanding of the mm parameter related to edismax is that,
> > if mm=100%,  only if ALL my query terms appear across any of the qf
> fields
> > will I get back
> > documents ... ie all the terms *need not be present in one single field*
> ..
> > they just need to be present across any of the fields in my qf list.
> >
> > But my query for  the terms:-
> > *blue stainless washer*
> > ... returns a document which has *Stainless Washer *in one of my qf
> > fields, but *blue *is not there in any of the qf fields. Then how did it
> > get returned even though I had given mm=100% (100%25 when I typed
> directly
> > in browser). Any suggestions please.. In fact this is my first record!
> >
> > 2.
> > Another question I have is:-
> > With edismax can I enforce that all my query terms should appear in ANY
> of
> > my qf fields to qualify as a result document? I know all terms appearing
> in
> > a single field can give a boost if we use the "pf" query parameter
> > accordingly. But how can I insist that to qualify as a result, the doc
> > should have ALL of my query term in one or more of the qf fields?
> >
> >
> > Cld some one pls help.
> >
> > Thanks!
> >
> > Mark
> >
> > On Tue, May 3, 2016 at 6:28 PM, Mark Robinson 
> > wrote:
> >
> >> Hi,
> >>
> >> 1.
> >> My understanding of the mm parameter related to edismax is that,
> >> if mm=100%,  only if ALL my query terms appear across any of the qf
> >> fields will I get back
> >> documents ... ie all the terms *need not be present in one single field*
> >> .. they just need to be present across any of the fields in my qf list.
> >>
> >> But my query for  the terms:-
> >> *blue stainless washer*
> >> ... returns a document which has *Stainless Washer *in one of my qf
> >> fields, but *refrigerator *is not there in any of the qf fields. Then
> >> how did it get returned even though I had given mm=100% (100%25 when I
> >> typed directly in browser). Any suggestions please.
> >>
> >> 2.
> >> Another question I have is:-
> >> With edismax can I enforce that all my query terms should appear in ANY
> >> of my qf fields to qualify as a result document? I know all terms
> appearing
> >> in a single field can give a boost if we use the "pf" query parameter
> >> accordingly. But how can I insist that to qualify as a result, the doc
> >> should have ALL of my query term in one or more of the qf fields?
> >>
> >>
> >> Cld some one pls help.
> >>
> >> Thanks!
> >> Mark.
> >>
> >
> >
>


Re: solr.ICUCollationField class on cloudera search solr

2016-05-04 Thread Ahmet Arslan
Hi,

Sometimes using the fully qualified class name works: using
org.apache.x.y.z.ICUCollationField instead of solr.ICUCollationField.
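
In this case that would be something like the following (a typical example;
the name, locale and strength values are just placeholders to adjust to your
schema):

<fieldType name="collatedROOT" class="org.apache.solr.schema.ICUCollationField"
           locale="" strength="primary"/>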
Ahmet

On Wednesday, May 4, 2016 11:13 AM, tkg_cangkul  wrote:



hi i'm using solr in cloudera.
when i try to create a core i've got this error message :

[error screenshot]

i have checked the library
/opt/cloudera/parcels/CDH/lib/solr/solr-analysis-extras-4.10.3-cdh5.7.0.jar
and there is a solr.ICUCollationField class. but why do i still have this
error message?
pls help


Re: Results of facet differs with change in facet.limit.

2016-05-04 Thread Toke Eskildsen
On Mon, 2016-05-02 at 15:53 +0530, Modassar Ather wrote:
> E.g.
> Query : q=text_field:term&facet.field=f&facet.limit=100
> Result :
> <int name="val1">1225</int>
> <int name="val2">1082</int>
> <int name="val3">1076</int>
> 
> Query : q=text_field:term&facet.field=f&facet.limit=200
> <int name="val1">1366</int>
> <int name="val2">1321</int>
> <int name="val3">1315</int>

Is the "val1" in your limit=100 test the same term as your "val1" in
your limit=200-test?


Or to phrase it another way: Do you have

limit=100
<int name="val1">1225</int>
<int name="val2">1082</int>
<int name="val3">1076</int>

limit=200
<int name="val1">1366</int>
<int name="val2">1321</int>
<int name="val3">1315</int>


or

limit=100
<int name="val1">1225</int>
<int name="val2">1082</int>
<int name="val3">1076</int>

limit=200
<int name="val4">1366</int>
<int name="val5">1321</int>
<int name="val6">1315</int>


- Toke Eskildsen, State and University Library, Denmark



Re: Include and exclude feature with multi valued fileds

2016-05-04 Thread Anil
Hi Ahmet,

in my example DOC 3 also has id 2 (a typo; it should be id 3). i am using the
edismax query parser.

i will try the query you suggested.

Regard,
Anil
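
(For reference, applying Ahmet's suggested query

q={!lucene}+customers:facebook -customers:google

to the three sample documents quoted below: it matches only doc 3. Doc 1 is
excluded because it also contains google, which is why the extra
customers:(+facebook +google) clause is needed if documents naming both
customers should match as well.)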







On 4 May 2016 at 12:28, Ahmet Arslan  wrote:

>
>
> Hi Anil,
>
> It is weird that your query retrieves docID=2; it has no facebook at all.
> What query parser are you using?
>
> Please try unary operators and without using quotes.
> q={!lucene} +customers:facebook -customers:google
>
> If I am not wrong above query should do the trick.
>
> But I didn't understand why you expect first document.
> If you really want to include that document too, you can capture it with
> customers:(+facebook +google) clause.
>
> Ahmet
>
>
>
> On Wednesday, May 4, 2016 8:39 AM, Anil  wrote:
> Hi Ahmet,
>
> Thanks for the response. Following are sample documents.
>
> Doc 1 :
>
> id : 1
> customers : ["facebook', "google"]
> issueId:1231
> description: Some description
>
> Doc2 :
>
> id : 2
> customers : ["twitter", "google"]
> issueId:1231
> description: Some description
>
> Doc3 :
>
> id : 2
> customers : ["facebook', "amazon"]
> issueId:1233
> description: Some description
>
> Query pattern : Get documents which include facebook as customer but not
> google
> Expected documents : id = 1, 3
> Used Query : customers:"facebook" and -customers:"google"
> Actual documents : id = 2
>
> Please let me know if I have to change the query to see the expected
> documents. Thanks.
>
> Regards,
> Anil
>
>
> On 3 May 2016 at 19:44, Ahmet Arslan  wrote:
>
> > Can you provide us example documents? Which you want to match which you
> > don't?
> >
> >
> >
> > On Tuesday, May 3, 2016 3:15 PM, Anil  wrote:
> > Any inputs please ?
> >
> >
> > On 2 May 2016 at 18:18, Anil  wrote:
> >
> > > HI,
> > >
> > > i have created a document with multi valued fields.
> > >
> > > Eg :
> > > An issue is impacting multiple customers, products, versions etc.
> > >
> > > In my issue document, i have created customers, products, versions as
> > > multi valued fields.
> > >
> > > how to find all issues that are impacting google (customer) but not
> > > facebook (customer) ?
> > >
> > > Google and facebook can be part of in single issue document.
> > >
> > > Please let me know if you have any questions. Thanks.
> > >
> > > Regards,
> > > Anil
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
>


Solr 6 / Solrj RuntimeException: First tuple is not a metadata tuple

2016-05-04 Thread deniz
I am trying to go through the steps here to start playing with the new API,
but I am getting:

java.sql.SQLException: java.lang.RuntimeException: First tuple is not a
metadata tuple
at
org.apache.solr.client.solrj.io.sql.StatementImpl.executeQuery(StatementImpl.java:70)
at com.sematext.blog.App.main(App.java:28)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.lang.RuntimeException: First tuple is not a metadata tuple
at
org.apache.solr.client.solrj.io.sql.ResultSetImpl.<init>(ResultSetImpl.java:75)
at
org.apache.solr.client.solrj.io.sql.StatementImpl.executeQuery(StatementImpl.java:67)
... 6 more



My code is:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;


/**
 * Hello world!
 *
 */
public class App
{
    public static void main( String[] args )
    {
        Connection connection = null;
        Statement statement = null;
        ResultSet resultSet = null;

        try {
            // ZooKeeper host:port, target collection, aggregation mode and
            // number of workers for the parallel SQL engine
            String connectionString =
                "jdbc:solr://zkhost:port?collection=test&aggregationMode=map_reduce&numWorkers=1";
            connection = DriverManager.getConnection(connectionString);
            statement = connection.createStatement();
            resultSet = statement.executeQuery(
                "select id, text from test where tits=1 limit 5");
            while (resultSet.next()) {
                String id = resultSet.getString("id");
                String nickname = resultSet.getString("text");

                System.out.println(id + " : " + nickname);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // close resources in reverse order of creation, ignoring
            // failures from close() itself
            if (resultSet != null) {
                try {
                    resultSet.close();
                } catch (Exception ex) {
                }
            }
            if (statement != null) {
                try {
                    statement.close();
                } catch (Exception ex) {
                }
            }
            if (connection != null) {
                try {
                    connection.close();
                } catch (Exception ex) {
                }
            }
        }
    }
}
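
For what it's worth, the same statement can also be sent straight to the
/sql handler to rule the JDBC client in or out (host name hypothetical;
this matches the request visible in the Solr log below):

curl --data-urlencode 'stmt=select id, text from test where tits=1 limit 5' \
  "http://localhost:8983/solr/test/sql?aggregationMode=map_reduce"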


I tried to figure out what is happening, but there are no logs other than
the stack trace above. And on the Solr side, the logs look okay:

2016-05-04 15:52:30.364 INFO  (qtp1634198-41) [c:test s:shard1 r:core_node1
x:test] o.a.s.c.S.Request [test]  webapp=/solr path=/sql
params={includeMetadata=true&numWorkers=1&wt=json&version=2.2&stmt=select+id,+text+from+test+where+tits%3D1+limit+5&aggregationMode=map_reduce}
status=0 QTime=3
2016-05-04 15:52:30.382 INFO  (qtp1634198-46) [c:test s:shard1 r:core_node1
x:test] o.a.s.c.S.Request [test]  webapp=/solr path=/select
params={q=(tits:"1")&distrib=false&fl=id,text,score&sort=score+desc&rows=5&wt=json&version=2.2}
hits=5624 status=0 QTime=1


Is the error happening because of some missing error handling in the code,
or because of some strict checks by the IDE (IntelliJ IDEA)? Has anyone had
similar issues while using SQL with SolrJ?


Thanks

Deniz



-
Smart, but doesn't work at it... Would manage it if he did...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-6-Solrj-RuntimeException-First-tuple-is-not-a-metadata-tuple-tp4274451.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Include and exclude feature with multi valued fields

2016-05-04 Thread Ahmet Arslan


Hi Anil,

It is weird that your query retrieves docID=2, which does not contain
facebook at all. What query parser are you using?

Please try the unary operators, without quotes:
q={!lucene} +customers:facebook -customers:google

If I am not wrong, the above query should do the trick.

But I didn't understand why you expect the first document: it contains
google, so "facebook but not google" should exclude it. If you really want
to capture documents that have both, you can do so with a
customers:(+facebook +google) clause.
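
To make that concrete against the three sample documents below (a sketch;
the '+' and '-' are left URL-unencoded for readability):

/select?q={!lucene} +customers:facebook -customers:google&fl=id,customers

should return only id=3 (facebook, amazon): doc 1 is excluded because it
also contains google, and doc 2 has no facebook.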

Ahmet



On Wednesday, May 4, 2016 8:39 AM, Anil  wrote:
Hi Ahmet,

Thanks for the response. Following are sample documents.

Doc 1 :

id : 1
customers : ["facebook', "google"]
issueId:1231
description: Some description

Doc2 :

id : 2
customers : ["twitter", "google"]
issueId:1231
description: Some description

Doc3 :

id : 2
customers : ["facebook', "amazon"]
issueId:1233
description: Some description

Query pattern : Get documents which include facebook as customer but not
google
Expected documents : id = 1, 3
Used Query : customers:"facebook" and -customers:"google"
Actual documents : id = 2

Please let me know if I have to change the query to see the expected
documents. Thanks.

Regards,
Anil


On 3 May 2016 at 19:44, Ahmet Arslan  wrote:

> Can you provide us example documents? Which you want to match which you
> don't?
>
>
>
> On Tuesday, May 3, 2016 3:15 PM, Anil  wrote:
> Any inputs please ?
>
>
> On 2 May 2016 at 18:18, Anil  wrote:
>
> > HI,
> >
> > i have created a document with multi valued fields.
> >
> > Eg :
> > An issue is impacting multiple customers, products, versions etc.
> >
> > In my issue document, i have created customers, products, versions as
> > multi valued fields.
> >
> > how to find all issues that are impacting google (customer) but not
> > facebook (customer) ?
> >
> > Google and facebook can be part of in single issue document.
> >
> > Please let me know if you have any questions. Thanks.
> >
> > Regards,
> > Anil
> >
> >
> >
> >
> >
> >
> >
> >
> >
>


Re: Results of facet differ with change in facet.limit.

2016-05-04 Thread Modassar Ather
Thanks Erick for your response.

I checked with distrib=false. I tried with a smaller result set.

*Search*
E.g. text_field:term AND f:val1
Number of matches : 49

*Facet:* (distrib=true)
text_field:term AND f:val1
*Result*
Shard1 :
<int name="val1">47</int>

*Facet:* (distrib=false)
text_field:term AND f:val1&distrib=false
*Result*
Shard1 :
<int name="val1">44</int>

Shard3 :
<int name="val1">2</int>

Shard8 :
<int name="val1">3</int>

All other shards out of 12 show a 0 count for val1. Note that 44 + 2 + 3 =
49, which matches the number of matches above, while the distributed result
shows only 47. It seems that the count from shard3 is not being added to
the main result.
Kindly comment.
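
(For reference, the per-shard checks above were requests of this form, run
against each core directly; host and core names are placeholders, and the
space in the query is left unencoded for readability:

http://host:8983/solr/collection_shard3_replica1/select?q=text_field:term AND f:val1&rows=0&facet=true&facet.field=f&distrib=false

rows=0 keeps the response down to just the facet counts.)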

Best,
Modassar

On Wed, May 4, 2016 at 5:33 AM, Erick Erickson 
wrote:

> Hmm, I'd be interested in what you get if you restrict your
> queries to individual shards using &distrib=false. This
> will go to the individual shard you address and no others.
>
> Does the facet count change in those circumstances?
>
> Best,
> Erick
>
> On Tue, May 3, 2016 at 4:48 AM, Modassar Ather 
> wrote:
> > I tried to reproduce the same issue with a field of the following type
> > but could not:
> > <field ... stored="false" omitNorms="true"/>
> >
> > Please share your inputs.
> >
> > Best,
> > Modassar
> >
> > On Tue, May 3, 2016 at 10:32 AM, Modassar Ather 
> > wrote:
> >
> >> Hi,
> >>
> >> Kindly share your inputs on this issue.
> >>
> >> Thanks,
> >> Modassar
> >>
> >> On Mon, May 2, 2016 at 3:53 PM, Modassar Ather 
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> I have a field f which is defined as follows on Solr 5.x. It is a
> >>> 12-shard cluster with no replicas.
> >>>
> >>> <field name="f" ... sortMissingLast="true"
> >>> stored="false" indexed="false" docValues="true"/>
> >>>
> >>> When I facet on this field with different facet.limit values I get
> >>> different facet counts.
> >>>
> >>> E.g.
> >>> Query : text_field:term&facet.field=f&facet.limit=100
> >>> Result :
> >>> <int name="val1">1225</int>
> >>> <int name="val2">1082</int>
> >>> <int name="val3">1076</int>
> >>>
> >>> Query : text_field:term&facet.field=f&facet.limit=200
> >>> <int name="val1">1366</int>
> >>> <int name="val2">1321</int>
> >>> <int name="val3">1315</int>
> >>> I am noticing lower counts in the facets whereas the numFound during
> >>> search is higher. Please refer to the following query for details.
> >>>
> >>> Query : text_field:term&facet.field=f
> >>> Result :
> >>> <int name="val1">1225</int>
> >>> <int name="val2">1082</int>
> >>> <int name="val3">1076</int>
> >>>
> >>> Query : text_field:term AND f:val1
> >>> Result: numFound=1366
> >>>
> >>> Kindly help me understand this behavior or let me know if it is an
> issue.
> >>>
> >>> Thanks,
> >>> Modassar
> >>>
> >>
> >>
>