Re: Solr cloud inquiry

2017-11-17 Thread Jaroslaw Rozanski
Hi James,

This might not be 100% what you are looking for but some ideas to
explore:

1. Change the session timeout on the ZooKeeper client; this may help move
the unresponsive node to the "down" state so that SolrCloud takes the
affected node out of rotation on its own.
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkSessions
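As a sketch of option 1 (the 15s value is an illustrative assumption, not a recommendation), the Solr-side ZooKeeper session timeout can be lowered via zkClientTimeout in solr.xml:

```xml
<!-- Sketch only: a lower zkClientTimeout lets ZooKeeper expire the session
     of a frozen node sooner, so it is marked "down" faster. Too low a value
     causes healthy nodes to flap during GC pauses, so tune with care. -->
<solr>
  <solrcloud>
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
  </solrcloud>
</solr>
```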

2. Create your own HttpClient with more aggressive connection/socket timeout
values and pass it to CloudSolrClient during construction; if the client
times out, retry. You can also ask ZooKeeper which nodes serve a given
shard and send the request to another node with the distrib=false flag; that
may be more intrusive depending on your shards/data model/queries.
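A rough sketch of the timeout-and-retry idea in plain Java (the helper name and retry policy are illustrative assumptions, not SolrJ API; in practice the Callable would wrap a CloudSolrClient query built with aggressive connect/socket timeouts):

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Sketch of "if the client times out, retry": retry the call a bounded number
// of times, treating a socket timeout as a retryable failure. A real version
// could target a different replica (with distrib=false) on each retry.
public class RetryOnTimeout {
    public static <T> T withRetries(Callable<T> call, int maxAttempts) throws Exception {
        SocketTimeoutException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (SocketTimeoutException e) {
                last = e; // slow/frozen node: try again (ideally another replica)
            }
        }
        throw last; // exhausted the retry budget
    }
}
```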

And the best of all suggestions: fix the infrastructure :)

Good luck!

--
Jaroslaw Rozanski

On Fri, 17 Nov 2017, at 00:42, kasinger, james wrote:
> Hi,
> 
> We aren’t seeing any exceptions from Solr during that time. When
> the disk freezes up, Solr waits (please refer to the attached GC image,
> which shows a period of about a minute where no new objects are created
> in memory). The node is still accepting and stacking requests, and when
> the disk becomes accessible again, Solr resumes those threads in a healthy
> state, albeit with increased latency.
> 
> We’ve explored solutions for marking the node as unhealthy when an
> incident like this occurs, but have determined that the risk of taking it
> out of rotation and impacting the cluster outweighs the momentary
> latency that we are experiencing.
> 
> I’ve attached a thread dump to show the Jetty threads that pile up while
> solr/storage is frozen, as well as a graph of total system threads
> increasing and CPU I/O wait on the disk.
> 
> It’s a temporary storage outage, though it could be viewed as a performance
> issue, and perhaps we need to become aware of more creative ways of
> handling degraded performance… Any ideas?
> 
> Thanks,
> James Kasinger
> 
> 
> On 11/15/17, 8:50 PM, "Jaroslaw Rozanski"  wrote:
> 
> Hi,
> 
> It is interesting that the node reports healthy despite the storage access
> issue.
> That node should be marked down if it can't open the core backing the
> sharded collection.
> 
> Maybe you could share the exceptions/errors that you see in the
> console/logs.
> 
> I have experienced issues with a replica node not responding in a timely
> manner due to performance issues, but that does not seem to match your
> case.
> 
> 
> --
> Jaroslaw Rozanski 
> 
> On Wed, 15 Nov 2017, at 22:49, kasinger, james wrote:
> > Hello folks,
> > 
> > 
> > 
> > To start, we have a sharded Solr Cloud configuration running Solr version
> > 5.1.0. During shard-to-shard communication there is a problem state
> > where queries are sent to a replica, and on that replica the storage is
> > inaccessible. The node is healthy, so it’s still taking requests, which
> > get piled up waiting to read from disk, resulting in a latency increase.
> > We’ve tried resolving this storage inaccessibility, but it appears related
> > to AWS EBS issues. Has anyone encountered the same issue?
> > 
> > thanks
> 
> 
> Email had 1 attachment:
> + 23c0_threads_bad.zip
>   24k (application/zip)


Re: Leading wildcard searches very slow

2017-11-17 Thread Amrit Sarkar
Sundeep,

You may want to explore
http://lucene.apache.org/solr/6_6_1/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
here.
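For reference, a minimal field type sketch using that filter at index time (the type name and parameter values here are illustrative assumptions):

```xml
<!-- Sketch: index terms both forwards and reversed so that leading-wildcard
     queries (e.g. *ring) can be rewritten into fast prefix queries against
     the reversed terms. The filter is applied at index time only. -->
<fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="2" maxPosQuestion="1" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The trade-off is a larger index, since (with withOriginal="true") each term is stored twice.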

Thanks
Amrit Sarkar

On 18 Nov 2017 6:06 a.m., "Sundeep T"  wrote:

> Hi,
>
> We have several indexed string fields which are not tokenized and do not
> have docValues enabled.
>
> When we do leading wildcard searches on these fields, they run very
> slowly. We were thinking that since these fields are indexed, such queries
> should run pretty quickly. We are using Solr 6.6.1. Does anyone have ideas
> on why these queries are slow, and whether there are any ways to speed
> them up?
>
> Thanks
> Sundeep
>


Leading wildcard searches very slow

2017-11-17 Thread Sundeep T
Hi,

We have several indexed string fields which are not tokenized and do not
have docValues enabled.

When we do leading wildcard searches on these fields, they run very
slowly. We were thinking that since these fields are indexed, such queries
should run pretty quickly. We are using Solr 6.6.1. Does anyone have ideas
on why these queries are slow, and whether there are any ways to speed
them up?

Thanks
Sundeep


Re: External file field

2017-11-17 Thread Walter Underwood
Thanks. I found this, which is much clearer than the manual.

http://www.openjems.com/solr-external-file-fields/

The Solr manual should include the info about how to declare the field.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Nov 17, 2017, at 2:48 PM, Chris Hostetter  wrote:
> 
> 
> : Do I need to define a field with <field> when I use an external file
> : field? I see the <fieldType> to define it, but the docs don’t say how
> : to define the field.
> 
> you define the field (or dynamicField) just like any other field -- the 
> fieldType is where you specify things like the 'keyField' & the 'defVal', 
> but then the field/dynamicField definition dictates the underlying 
> filename that will be used
> 
> So if you want 5 diff ExternalFileFields that all use keyField="id" then you 
> only need one <fieldType> and five <field>s -- but if you need them all 
> to have 5 diff 'defVal' then you need five <fieldType>s and five 
> <field>s
> 
> 
> 
> -Hoss
> http://www.lucidworks.com/



Re: External file field

2017-11-17 Thread Chris Hostetter

: Do I need to define a field with <field> when I use an external file
: field? I see the <fieldType> to define it, but the docs don’t say how
: to define the field.

you define the field (or dynamicField) just like any other field -- the 
fieldType is where you specify things like the 'keyField' & the 'defVal', 
but then the field/dynamicField definition dictates the underlying 
filename that will be used

So if you want 5 diff ExternalFileFields that all use keyField="id" then you 
only need one <fieldType> and five <field>s -- but if you need them all 
to have 5 diff 'defVal' then you need five <fieldType>s and five 
<field>s
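A minimal schema sketch of the one-fieldType/several-fields case (the type, field, and key names are hypothetical):

```xml
<!-- Sketch: one ExternalFileField type shared by several fields. Each field's
     values are read from a file named external_<fieldname> in the index
     directory, keyed by the uniqueKey field "id". -->
<fieldType name="extRank" class="solr.ExternalFileField" keyField="id" defVal="0"/>
<field name="rank_click" type="extRank"/>
<field name="rank_sales" type="extRank"/>
```

With this sketch, values for rank_click would live in a file named external_rank_click, one "key=value" line per document.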



-Hoss
http://www.lucidworks.com/


External file field

2017-11-17 Thread Walter Underwood
Do I need to define a field with <field> when I use an external file field? I
see the <fieldType> to define it, but the docs don’t say how to define the
field.

The docs say that the file uses the field name as part of the filename, but the
<fieldType> directive defines a type name, not a field name. Right?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)




Solr7: Very High number of threads on aggregator node

2017-11-17 Thread Nawab Zada Asad Iqbal
Hi,

I have a sharded Solr 7 cluster and I am using an aggregator node (which has
no data/index of its own) to distribute queries and aggregate results from
the shards. I am puzzled that when I use Solr 7 on the aggregator node, the
number of threads shoots up to 32000 on that host and the process then
reaches its memory limits. However, when I use Solr 4 on the aggregator,
it all seems to work fine; the peak number of threads during my testing was
around 4000 or so. The test load is the same in both cases, except that it
doesn't finish in the Solr 7 case (due to the memory/thread issue).
The memory settings and the Jetty threadpool setting (max=1) are also
consistent on both servers (Solr 4 and Solr 7).


Has anyone else been in similar circumstances?


Thanks
Nawab


FOSS Backstage Micro Summit on Monday in Berlin

2017-11-17 Thread Uwe Schindler
Hi,

It's already a bit late, but for anyone visiting Germany next week who wants to 
make a short trip to Berlin: there are still slots free at the FOSS Backstage 
Micro Summit. It is a mini-conference on everything related to governance, 
collaboration, legal, and economics within the scope of FOSS. The main event 
will take place as part of Berlin Buzzwords 2018. We have a lot of speakers 
invited - also from the ASF!

https://www.foss-backstage.de/

Program:
https://www.foss-backstage.de/news/micro-summit-program-online-now

I hope to see you there,
Uwe

-
Uwe Schindler
uschind...@apache.org 
ASF Member, Apache Lucene PMC / Committer
Bremen, Germany
http://lucene.apache.org/




Re: TimeZone issue

2017-11-17 Thread Chris Hostetter
: 
: As I said before, I do not think that Solr will use timezones for date display
: -- ever.  Solr does support timezones in certain circumstances, but I'm pretty

One possibility that has been discussed in the past is the idea of a "Date 
Formatting DocTransformer" that would always return a String in the 
specified format (regardless of the ResponseWriter and any native 
support for Dates it has, a la javabin or xml).

That would be a fairly straightforward plugin to write if someone was so 
inclined -- the hardest part would be deciding on the syntax, so that you 
could specify the client's preferred format, timezone, locale, etc.  But 
then folks who really want to pass off Solr's csv/json/whatever 
responsewriter format directly to an end consumer could control that.

(Likewise, we could imagine a "Number Formatting DocTransformer" that would 
do the same thing for people who really want their Integers to come back 
as "1,234,567" or their floats formatted to exactly 4 decimal places, 
etc...)
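Purely as a hypothetical illustration of what such a transformer's request syntax might look like (nothing here exists; the transformer name and parameters are invented for discussion):

```
fl=id,timestamp_dt,[datefmt f=timestamp_dt format="yyyy-MM-dd HH:mm" tz=America/Los_Angeles locale=en_US]
```

The open design questions would be exactly the ones named above: how to spell the format, timezone, and locale parameters, and whether one transformer instance can apply to multiple fields.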


-Hoss
http://www.lucidworks.com/

Re: Editing the community wiki of Solr

2017-11-17 Thread Erick Erickson
Done.

On Thu, Nov 16, 2017 at 1:56 AM, Cedric Ulmer
 wrote:
> Hi Solr community,
>
>
>
> I need to update the support page of wiki.apache.org with regard to what
> France Labs does on Solr. Can you add the user <FranceLabs> so that we can
> edit the page?
>
>
>
> Regards,
>
>
> Cedric
>
>
>
> President
>
> France Labs - Les experts du Search
>


Editing the community wiki of Solr

2017-11-17 Thread Cedric Ulmer
Hi Solr community,

 

I need to update the support page of wiki.apache.org with regard to what
France Labs does on Solr. Can you add the user <FranceLabs> so that we can
edit the page?

 

Regards,


Cedric

 

President

France Labs - Les experts du Search 



RE: Search suggester - threshold parameter

2017-11-17 Thread Peter Lancaster
Hi Ruby,

The documentation says that threshold is available for the 
HighFrequencyDictionaryFactory implementation. Since you're using 
DocumentDictionaryFactory, I guess it will be ignored.
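A sketch of a suggester configuration where the threshold parameter does apply (per the documentation it is honored by HighFrequencyDictionaryFactory; the other values simply mirror the original config):

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
    <str name="field">title</str>
    <!-- Ignore terms that appear in fewer than 0.5% of documents -->
    <float name="threshold">0.005</float>
  </lst>
</searchComponent>
```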

Cheers,
Peter.

-Original Message-
From: ruby [mailto:rshoss...@gmail.com]
Sent: 17 November 2017 15:41
To: solr-user@lucene.apache.org
Subject: Search suggester - threshold parameter

Does any of the phrase suggesters in Solr 6.1 honor the threshold parameter?

I made the following changes to enable phrase suggestions in my environment.
I played with different threshold values, but it looks like the parameter is
not being used.


<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="indexPath">suggester_fuzzy_dir</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="suggestAnalyzerFieldType">suggestType</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
    <float name="threshold">0.005</float>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


This message is confidential and may contain privileged information. You should 
not disclose its contents to any other person. If you are not the intended 
recipient, please notify the sender named above immediately. It is expressly 
declared that this e-mail does not constitute nor form part of a contract or 
unilateral obligation. Opinions, conclusions and other information in this 
message that do not relate to the official business of findmypast shall be 
understood as neither given nor endorsed by it.


__

This email has been checked for virus and other malicious content prior to 
leaving our network.
__


Re: get all tokens from TokenStream in my custom filter

2017-11-17 Thread Ahmet Arslan
Hi Kumar,

If I am not wrong, I think there is a method named something like peek(2) or
advance(2). Some filters access tokens ahead and perform some logic.

Ahmet

On Wednesday, November 15, 2017, 10:50:55 PM GMT+3, kumar gaurav
 wrote:
 
 Hi

I need to get full field value from TokenStream in my custom filter class .

I am using this:

stream.reset();
while (stream.incrementToken()) {
    term += " " + charTermAttr.toString();
}
stream.end();
stream.close();

This ends the stream; no tokens are produced if I use this.

I want to get the full string without hampering token creation.

Eric! Are you there? :) Anyone, please help?
  

Re: DocValues

2017-11-17 Thread S G
Thank you Erick and Shawn.


1) So it seems like docValues should always be preferred over stored fields
for retrieval, if sorting of multivalued fields is not a concern. Is that a
correct understanding?


2) Also, in-place atomic updates (with docValues=true and
stored/indexed=false) should be much faster than regular atomic updates
(with docValues=anything and stored/indexed=true). This is because an
in-place update just looks up the document's entry in the column-oriented
structure and changes the value there. The document itself is not re-indexed,
because stored and indexed are both false for an in-place update. If there is
any benchmark to verify this, it would be great.


3) If the performance is dreadful when searching docValues=true,
indexed=false fields, then why is that even allowed? Shouldn't Solr just
give an error for such cases?


Thanks
SG




On Fri, Nov 17, 2017 at 6:50 AM, Erick Erickson 
wrote:

> I'll add that using docValues in place of stored is much more
> efficient than using stored. To access stored=true data
> 1> a 16K block must be read from disk
> 2> the 16K block must be decompressed.
>
> With docValues, the value is a simple lookup, the value is probably in
> memory already (MMapped) and the decompression of a large block is
> unnecessary.
>
> There is one caveat: docValues uses (for multiValued fields) a
> SORTED_SET. Therefore multiple identical values are collapsed and the
> values are sorted. So if your input was
> 5, 6, 3, 4, 3, 3, 3
> the retrieved values would be
> 3, 4, 5, 6
>
> If this is NOT ok for your app, then you should use stored values to
> retrieve. Otherwise DocValues is preferred.
>
> Best,
> Erick
>
> On Fri, Nov 17, 2017 at 5:44 AM, Shawn Heisey  wrote:
> > On 11/17/2017 12:53 AM, S G wrote:
> >>
> >> Going through
> >>
> >> https://www.elastic.co/guide/en/elasticsearch/guide/
> current/_deep_dive_on_doc_values.html
> >> ,
> >> is it possible to enable only docValues and disable stored/indexed
> >> attributes for a field?
> >
> >
> > Yes, this is possible.  In fact, if you want to do in-place Atomic
> updates,
> > this is how the field must be set up.
> >
> > https://lucene.apache.org/solr/guide/6_6/updating-parts-
> of-documents.html#UpdatingPartsofDocuments-In-PlaceUpdates
> >
> >> In that case, the field will become only sortable/facetable/pivotable
> but
> >> it cannot be searched nor can it be retrieved?
> >
> >
> > Recent Solr versions can use docValues instead of stored when retrieving
> > data for results.  This can be turned on/off on a per-field basis.  The
> > default setting is enabled if you're using a current schema version.
> >
> > https://lucene.apache.org/solr/guide/6_6/docvalues.html#DocValues-
> RetrievingDocValuesDuringSearch
> >
> > As I understand it, you actually *can* search docValues-only fields
> (which
> > would require a match to the entire field -- no text analysis), but
> because
> > it works similarly to a full-table scan in a database, the performance is
> > dreadful on most fields, and it's NOT recommended.
> >
> > Thanks,
> > Shawn
>


Search suggester - threshold parameter

2017-11-17 Thread ruby
Does any of the phrase suggesters in Solr 6.1 honor the threshold parameter?

I made the following changes to enable phrase suggestions in my environment.
I played with different threshold values, but it looks like the parameter is
not being used.


<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="indexPath">suggester_fuzzy_dir</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="suggestAnalyzerFieldType">suggestType</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
    <float name="threshold">0.005</float>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Multipath hierarchical faceting

2017-11-17 Thread Erick Erickson
Not quite sure if this suits your needs, but what about:
PathHierarchyTokenizerFactory?

On Fri, Nov 17, 2017 at 5:24 AM, Emir Arnautović
 wrote:
> Hi,
> In order to use this feature, you would have to patch Solr and build it 
> yourself. But note that this ticket is old and the last patch version is from 
> 2014, so you would either have to patch the referenced version or adjust the 
> patch to work with the version that you are targeting.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
>> On 17 Nov 2017, at 11:28, 喜阳阳 <389610...@qq.com> wrote:
>>
>> Hi
>> I am a newbie and I do not understand SOLR-2412. I am trying Solr and I have 
>> one question. I want to use multipath hierarchical faceting, but I do not 
>> know how to do it, because I want to use it with the Solr API.
>> Is there any class in the newest Solr API to specify the HierarchicalFacetField, 
>> like org.apache.solr.schema.HierarchicalFacetField in Solr 3.10? Looking 
>> forward to your reply. Thank you very much.
>


Re: DocValues

2017-11-17 Thread Erick Erickson
I'll add that using docValues in place of stored is much more
efficient than using stored. To access stored=true data
1> a 16K block must be read from disk
2> the 16K block must be decompressed.

With docValues, the value is a simple lookup, the value is probably in
memory already (MMapped) and the decompression of a large block is
unnecessary.

There is one caveat: docValues uses (for multiValued fields) a
SORTED_SET. Therefore multiple identical values are collapsed and the
values are sorted. So if your input was
5, 6, 3, 4, 3, 3, 3
the retrieved values would be
3, 4, 5, 6

If this is NOT ok for your app, then you should use stored values to
retrieve. Otherwise DocValues is preferred.
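The collapse-and-sort behavior can be sketched with a plain TreeSet, which has the same duplicate-collapsing, sorted semantics (this only illustrates the semantics; it is not Lucene's actual docValues implementation):

```java
import java.util.Arrays;
import java.util.TreeSet;

public class SortedSetDemo {
    public static void main(String[] args) {
        // Stored fields preserve insertion order and duplicates...
        Integer[] stored = {5, 6, 3, 4, 3, 3, 3};
        // ...while multiValued docValues behave like a SORTED_SET:
        // duplicates collapse and values come back sorted.
        TreeSet<Integer> docValuesView = new TreeSet<>(Arrays.asList(stored));
        System.out.println(docValuesView); // prints [3, 4, 5, 6]
    }
}
```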

Best,
Erick

On Fri, Nov 17, 2017 at 5:44 AM, Shawn Heisey  wrote:
> On 11/17/2017 12:53 AM, S G wrote:
>>
>> Going through
>>
>> https://www.elastic.co/guide/en/elasticsearch/guide/current/_deep_dive_on_doc_values.html
>> ,
>> is it possible to enable only docValues and disable stored/indexed
>> attributes for a field?
>
>
> Yes, this is possible.  In fact, if you want to do in-place Atomic updates,
> this is how the field must be set up.
>
> https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-In-PlaceUpdates
>
>> In that case, the field will become only sortable/facetable/pivotable but
>> it cannot be searched nor can it be retrieved?
>
>
> Recent Solr versions can use docValues instead of stored when retrieving
> data for results.  This can be turned on/off on a per-field basis.  The
> default setting is enabled if you're using a current schema version.
>
> https://lucene.apache.org/solr/guide/6_6/docvalues.html#DocValues-RetrievingDocValuesDuringSearch
>
> As I understand it, you actually *can* search docValues-only fields (which
> would require a match to the entire field -- no text analysis), but because
> it works similarly to a full-table scan in a database, the performance is
> dreadful on most fields, and it's NOT recommended.
>
> Thanks,
> Shawn


Solr 6.6.0 - Query does not implement createWeight

2017-11-17 Thread Mihail Radkov
Hello everybody! 

I've migrated from Solr 6.5.0 to Solr 6.6.0 but it introduced a problem with
wildcards in join edismax queries. 

My documents look similar to this:
{ 
"id": "1", 
"title": "testing", 
"hasAttachment": ["2"] 
} 
{ 
"id": "2", 
"title": "an attachment" 
} 

So, to query all documents that match test*, and also any documents that are
attachments of those but with a lower score, I use the following query
parameters:

{ 
  "q":"({!join score=max from=id to=hasAttachment }{!edismax v=$uq
qf=title} OR {!edismax v=$uq})", 
  "defType":"edismax", 
  "indent":"on", 
  "qf":"title^10", 
  "fl":"id,score", 
  "uq":"test*", // uq is custom parameter for terms entered by users
used as variable in q 
  "wt":"json" 
} 

However, in Solr 6.6.0 the wildcard in test* causes Solr to throw an
exception:
java.lang.UnsupportedOperationException: Query title:test* does not
implement createWeight

If I remove score=max, Solr returns results, but it does not rank the
attachments.

I managed to reproduce the problem with the example data shipped with
Solr:

1) Download Solr 6.6.0 

2) Start it with the provided test data techproducts 
./bin/solr start -e techproducts

3) Executing the next join query returns the expected results 

http://localhost:8983/solr/techproducts/select?defType=edismax&indent=on&q=({!join%20from=manu_id_s%20to=id%20score=max}{!edismax%20v=$uq%20qf=name}%20OR%20{!edismax%20v=$uq})&qf=name^2&uq=canon&wt=json

4) However if I change uq to include a wildcard, e.g. uq=cano* 

http://localhost:8983/solr/techproducts/select?defType=edismax&indent=on&q=({!join%20from=manu_id_s%20to=id%20score=max}{!edismax%20v=$uq%20qf=name}%20OR%20{!edismax%20v=$uq})&qf=name^2&uq=cano*&wt=json

this results in the following exception: 

java.lang.UnsupportedOperationException: Query name:cano* does not implement
createWeight 
   at org.apache.lucene.search.Query.createWeight(Query.java:66) 
   at
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:751) 
   at
org.apache.lucene.search.DisjunctionMaxQuery$DisjunctionMaxWeight.<init>(DisjunctionMaxQuery.java:106)
 
   at
org.apache.lucene.search.DisjunctionMaxQuery.createWeight(DisjunctionMaxQuery.java:190)
 
   at
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:751) 
   at org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:60) 
   at
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:225) 
   at
org.apache.lucene.search.join.TermsIncludingScoreQuery.createWeight(TermsIncludingScoreQuery.java:100)
 
   at
org.apache.solr.search.join.ScoreJoinQParserPlugin$SameCoreJoinQuery.createWeight(ScoreJoinQParserPlugin.java:171)
 
   at
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:751) 
   at org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:60) 
   at
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:225) 
   at
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:751) 
   at
org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:734)
 
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472) 
   at
org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:217)
 
   at
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1582)
 
   at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1399)
 
   at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:566) 
   at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:545)
 
   at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)
 
   at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
 
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477) 
   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723) 
   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529) 
   at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
 
   at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
 
   at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
 
   at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) 
   at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) 
   at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) 
   at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
 
   at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
 
   at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) 
   at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
 
   at

Re: A problem of tracking the commits of Lucene using SHA num

2017-11-17 Thread Shawn Heisey

On 11/16/2017 5:28 PM, TOM wrote:

Recently, I acquired a batch of commit SHA data for Lucene, spanning 2010 to 
2015. In order to get the original info, I tried to use these SHA values to 
track the commits.

In summary: 1) did the method used to generate a commit's SHA change at some 
point? 2) given that the second mirror repository stopped updating in 2014, 
how can I track all of the commits in my dataset?


When you asked this same question on November 9th (and even earlier on 
the dev list), I replied with this information, and what I told you was 
confirmed by Chris Hostetter.


Yes, all of the SHA data in the github repository *has* changed.  This 
event happened in early to mid January 2016.


At that time, the Lucene/Solr project migrated the source repository 
from Subversion to git.  Before this, the github mirror was (as stated 
by Chris) an automated realtime svn->git conversion set up by Apache 
Infra.  It was actually the github mirror that forced our hand to 
complete our own repository migration -- that realtime svn->git 
conversion was requiring so many resources that it was crashing the 
conversion process for other projects.  Infra was going to turn the 
conversion of our repository off, and we would no longer *have* an 
up-to-date github mirror.  As a project, we had been discussing the 
conversion already, but it was that problem that pushed people into action.


The official conversion of the repository from svn to git produced a 
repository with entirely different commit SHA values than the old github 
mirror, and we couldn't use it to maintain that mirror.  The github 
mirror that matched your commit data was completely deleted, and then 
rebuilt as a true mirror of the official git repository.


The commit data you're using is nearly useless, because the repository 
where it originated has been gone for nearly two years.  If you can find 
out how it was generated, you can build a new version from the current 
repository -- either on github or from Apache's official servers.


Thanks,
Shawn


Re: DocValues

2017-11-17 Thread Shawn Heisey

On 11/17/2017 12:53 AM, S G wrote:

Going through
https://www.elastic.co/guide/en/elasticsearch/guide/current/_deep_dive_on_doc_values.html
,
is it possible to enable only docValues and disable stored/indexed
attributes for a field?


Yes, this is possible.  In fact, if you want to do in-place Atomic 
updates, this is how the field must be set up.


https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-In-PlaceUpdates
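As a sketch, such a field would be declared roughly like this ("popularity" is a hypothetical field name; the requirement is a single-valued, non-indexed, non-stored numeric field with docValues enabled):

```xml
<!-- Sketch: only a docValues-only numeric field qualifies for
     in-place atomic updates. -->
<field name="popularity" type="int" indexed="false" stored="false"
       docValues="true" multiValued="false"/>
```

An in-place update would then look like {"id":"doc1", "popularity":{"set": 42}}.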


In that case, the field will become only sortable/facetable/pivotable but
it cannot be searched nor can it be retrieved?


Recent Solr versions can use docValues instead of stored when retrieving 
data for results.  This can be turned on/off on a per-field basis.  The 
default setting is enabled if you're using a current schema version.


https://lucene.apache.org/solr/guide/6_6/docvalues.html#DocValues-RetrievingDocValuesDuringSearch

As I understand it, you actually *can* search docValues-only fields 
(which would require a match to the entire field -- no text analysis), 
but because it works similarly to a full-table scan in a database, the 
performance is dreadful on most fields, and it's NOT recommended.


Thanks,
Shawn


Re: Multipath hierarchical faceting

2017-11-17 Thread Emir Arnautović
Hi,
In order to use this feature, you would have to patch Solr and build it 
yourself. But note that this ticket is old and the last patch version is from 
2014, so you would either have to patch the referenced version or adjust the 
patch to work with the version that you are targeting.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 17 Nov 2017, at 11:28, 喜阳阳 <389610...@qq.com> wrote:
> 
> Hi
> I am a newbie and I do not understand SOLR-2412. I am trying Solr and I have 
> one question. I want to use multipath hierarchical faceting, but I do not 
> know how to do it, because I want to use it with the Solr API.
> Is there any class in the newest Solr API to specify the HierarchicalFacetField, 
> like org.apache.solr.schema.HierarchicalFacetField in Solr 3.10? Looking 
> forward to your reply. Thank you very much.



Multipath hierarchical faceting

2017-11-17 Thread 喜阳阳
Hi
I am a newbie and I do not understand SOLR-2412. I am trying Solr and I have 
one question. I want to use multipath hierarchical faceting, but I do not 
know how to do it, because I want to use it with the Solr API.
Is there any class in the newest Solr API to specify the HierarchicalFacetField, 
like org.apache.solr.schema.HierarchicalFacetField in Solr 3.10? Looking 
forward to your reply. Thank you very much.