Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-07 Thread sanjay dutt
Best explanation I've found so far. I will migrate to LatLonPointSpatialField and 
try to share the benchmark data here. Thanks again, David.
Cheers, Sanjay

  On Sat, Aug 8, 2020 at 3:31 AM, David Smiley wrote:   
Since you have a typical use-case (point data, queries that are
rectangles), I strongly encourage you to migrate to LatLonPointSpatialField:

https://builds.apache.org/job/Solr-reference-guide-master/javadoc/spatial-search.html#latlonpointspatialfield
It's based on an internal "BKD" tree index (it doesn't use FSTs), which is
different from the terms-based index used by the RPT field that you are
using, which employs FSTs.  To be clear, FSTs are awesome, but the BKD index
is tailored for numeric data whereas terms/FSTs are not.

If your FSTs are/were taking up so much memory, you are probably not using
Solr 8.4.0 or beyond, which moved to having the FSTs off-heap -- at least
the ones associated with the field indexes.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Aug 6, 2020 at 8:19 PM sanjay dutt
 wrote:

> FieldType defined with class solr.SpatialRecursivePrefixTreeFieldType
>
> In this we are adding points only although collection has few fields with
> points data and then other fieldTypes as well.
> And one of the queries looks like
> (my_field: [45,-94 TO 46,-93]+OR+my_field: [42,-94 TO 43,-93])
>
> Thanks and Regards,Sanjay Dutt
>
>    On Thursday, August 6, 2020, 12:10:04 AM GMT+5:30, David Smiley <
> dsmi...@apache.org> wrote:
>
>  What is the Solr field type definition for this field?  And what sort of
> spatial data do you add here -- just points or what?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Aug 3, 2020 at 10:09 PM sanjay dutt
>  wrote:
>
> > Hello Solr community,
> > On our production SolrCloud servers, OutOfMemory errors have been occurring
> > on a lot of instances. When I downloaded the heap dumps and analyzed them, I
> > found that in multiple heap dumps there are lots of instances of
> > org.apache.lucene.codecs.blocktree.BlockTreeTermsReader, which has the
> > highest retained heap. I then checked the outgoing references for those
> > objects, and org.apache.lucene.util.fst.FST is the one that occupies 90% of
> > the heap memory.
> > The numbers look like this:
> > Production heap memory: 12 GB
> > org.apache.lucene.codecs.blocktree.BlockTreeTermsReader total retained
> > heap: 7-8 GB (varies from instance to instance)
> > org.apache.lucene.util.fst.FST total retained heap: 6-7 GB
> > Looking further, I calculated that the total retained heap for
> > FieldReader.fieldInfo.name="my_field" is around 7 GB. This is the same
> > reader that also holds the reference to org.apache.lucene.util.fst.FST.
> > "my_field" is the field on which we perform spatial searches.
> > Do spatial searches use FSTs internally, and is that why we are seeing so
> > much heap memory used by FSTs?
> > Is there any way we can optimize the spatial searches so that they take
> > less memory?
> > Can someone please give me a pointer on where I should start looking to
> > debug this issue?
> > Thanks and Regards, Sanjay Dutt
>
  


RE: solr startup

2020-08-07 Thread Schwartz, Tony
Suggester?  What do I need to look for in the configs?

Tony






 Original message 
From: Dave 
Date: 8/7/20 18:23 (GMT-05:00)
To: solr-user@lucene.apache.org
Subject: Re: solr startup

It sounds like you have suggester indexes being built on startup.  Without them, 
the collections just come up in a second or so.

> On Aug 7, 2020, at 6:03 PM, Schwartz, Tony  wrote:
>
> I have many collections.  When I start solr, it takes 30 - 45 minutes to 
> start up and load all the collections.  My collections are named per day.  
> During startup, solr loads the collections in alpha-numeric name order.  I 
> would like solr to load the collections in the descending order.  So the most 
> recent collections are loaded first and are available for searching while the 
> older collections are not as important.  Is this possible?
>
>


Re: solr startup

2020-08-07 Thread Dave
It sounds like you have suggester indexes being built on startup.  Without them, 
the collections just come up in a second or so.

> On Aug 7, 2020, at 6:03 PM, Schwartz, Tony  wrote:
> 
> I have many collections.  When I start solr, it takes 30 - 45 minutes to 
> start up and load all the collections.  My collections are named per day.  
> During startup, solr loads the collections in alpha-numeric name order.  I 
> would like solr to load the collections in the descending order.  So the most 
> recent collections are loaded first and are available for searching while the 
> older collections are not as important.  Is this possible?
> 
> 


solr startup

2020-08-07 Thread Schwartz, Tony
I have many collections.  When I start solr, it takes 30 - 45 minutes to start 
up and load all the collections.  My collections are named per day.  During 
startup, solr loads the collections in alpha-numeric name order.  I would like 
solr to load the collections in the descending order.  So the most recent 
collections are loaded first and are available for searching while the older 
collections are not as important.  Is this possible?




Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-07 Thread David Smiley
Since you have a typical use-case (point data, queries that are
rectangles), I strongly encourage you to migrate to LatLonPointSpatialField:

https://builds.apache.org/job/Solr-reference-guide-master/javadoc/spatial-search.html#latlonpointspatialfield
It's based on an internal "BKD" tree index (it doesn't use FSTs), which is
different from the terms-based index used by the RPT field that you are
using, which employs FSTs.  To be clear, FSTs are awesome, but the BKD index
is tailored for numeric data whereas terms/FSTs are not.

If your FSTs are/were taking up so much memory, you are probably not using
Solr 8.4.0 or beyond, which moved to having the FSTs off-heap -- at least
the ones associated with the field indexes.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Aug 6, 2020 at 8:19 PM sanjay dutt
 wrote:

> FieldType defined with class solr.SpatialRecursivePrefixTreeFieldType
>
> In this we are adding points only although collection has few fields with
> points data and then other fieldTypes as well.
> And one of the queries looks like
> (my_field: [45,-94 TO 46,-93]+OR+my_field: [42,-94 TO 43,-93])
>
> Thanks and Regards,Sanjay Dutt
>
> On Thursday, August 6, 2020, 12:10:04 AM GMT+5:30, David Smiley <
> dsmi...@apache.org> wrote:
>
>  What is the Solr field type definition for this field?  And what sort of
> spatial data do you add here -- just points or what?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Aug 3, 2020 at 10:09 PM sanjay dutt
>  wrote:
>
> > Hello Solr community,
> > On our production SolrCloud servers, OutOfMemory errors have been occurring
> > on a lot of instances. When I downloaded the heap dumps and analyzed them, I
> > found that in multiple heap dumps there are lots of instances of
> > org.apache.lucene.codecs.blocktree.BlockTreeTermsReader, which has the
> > highest retained heap. I then checked the outgoing references for those
> > objects, and org.apache.lucene.util.fst.FST is the one that occupies 90% of
> > the heap memory.
> > The numbers look like this:
> > Production heap memory: 12 GB
> > org.apache.lucene.codecs.blocktree.BlockTreeTermsReader total retained
> > heap: 7-8 GB (varies from instance to instance)
> > org.apache.lucene.util.fst.FST total retained heap: 6-7 GB
> > Looking further, I calculated that the total retained heap for
> > FieldReader.fieldInfo.name="my_field" is around 7 GB. This is the same
> > reader that also holds the reference to org.apache.lucene.util.fst.FST.
> > "my_field" is the field on which we perform spatial searches.
> > Do spatial searches use FSTs internally, and is that why we are seeing so
> > much heap memory used by FSTs?
> > Is there any way we can optimize the spatial searches so that they take
> > less memory?
> > Can someone please give me a pointer on where I should start looking to
> > debug this issue?
> > Thanks and Regards, Sanjay Dutt
>


Re: wt=xml not defaulting the results to xml format

2020-08-07 Thread Shawn Heisey

On 8/7/2020 9:30 AM, yaswanth kumar wrote:

solr/PROXIMITY_DATA_V2/select?q=pkey:223_*=true=country_en=country_en





Whatever I try, nothing works other than sending wt=xml as a
parameter while hitting the URL.


I tried your solrconfig.xml addition and a URL similar to yours out on 
8.5.1, using the techproducts example.  The results were in XML.


I'm betting that you modified a copy of solrconfig.xml that is *NOT* the 
correct one for PROXIMITY_DATA_V2.  Or that after modifying it, you did 
not reload the core or restart Solr.


If your Solr server is in cloud mode, then you must modify the 
solrconfig.xml that lives in the ZooKeeper database, under the config in 
use for the collection.


If your server is not in cloud mode, then the relevant file is most 
likely to be the solrconfig.xml that is in the core's "conf" directory. 
Modifying the version of the file under the configsets directory after 
the core is created will not change anything.
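
For reference, the reload step mentioned above can be triggered over HTTP. The following is a minimal sketch that only builds the request URLs, assuming a Solr base URL of http://localhost:8983/solr (an assumption, adjust for your install). It uses the Collections API RELOAD action for cloud mode and the CoreAdmin RELOAD action otherwise; the helper name reload_url is hypothetical.

```python
from urllib.parse import urlencode


def reload_url(base_url: str, name: str, cloud: bool) -> str:
    """Build the URL that asks Solr to reload a collection (cloud) or a core (non-cloud)."""
    if cloud:
        # Collections API: reloads every replica of the named collection
        query = urlencode({"action": "RELOAD", "name": name})
        return f"{base_url}/admin/collections?{query}"
    # CoreAdmin API: reloads a single core on this node
    query = urlencode({"action": "RELOAD", "core": name})
    return f"{base_url}/admin/cores?{query}"


# Base URL is an assumption; PROXIMITY_DATA_V2 is the collection named in this thread.
print(reload_url("http://localhost:8983/solr", "PROXIMITY_DATA_V2", cloud=True))
print(reload_url("http://localhost:8983/solr", "PROXIMITY_DATA_V2", cloud=False))
```

In non-cloud mode the core name may differ from the name used here, so check the core list in the admin UI before sending the request.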


I am also curious about the answer to Alexandre's question -- what is in 
the echoed parameters found in the incorrect response?  Setting 
echoParams to "all" as you have done can be very useful for this.


Thanks,
Shawn


Re: Solrj client 8.6.0 issue special characters in query

2020-08-07 Thread Chris Hostetter

: Hmm, setting -Dfile.encoding=UTF-8 solves the problem. I have to now check
: which component of the application screws it up, but at the moment I do NOT
: believe it is related to Solrj.

You can use the "forbidden-apis" project to analyze your code and look for 
uses of APIs that depend on the default file encoding, locale, charset, 
etc...

https://github.com/policeman-tools/forbidden-apis

...this project started as an offshoot of build rules in 
Lucene/Solr, precisely to help detect problems like the one you 
are facing -- and it's used to analyze all Solr code, which is why i'm 
pretty confident that no SolrJ code is mistakenly 
parsing/converting/encoding your input -- although in theory it could be 
a 3rd party library Solr uses.  (Hardcoding the unicode string in your 
java application and passing it as a solr param should help prove/disprove 
that)
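
The encoding reported in the quoted message below is what double encoding looks like: the UTF-8 bytes of "ö" get misread as a single-byte charset and are then encoded as UTF-8 a second time. Here is a small sketch that reproduces both strings, assuming the misread charset is ISO-8859-1 (an assumption; the thread only establishes that -Dfile.encoding=UTF-8 made the problem go away).

```python
from urllib.parse import quote

query = "Jörn"

# Correct: percent-encode the UTF-8 bytes of the string once.
correct = quote(query)

# Simulated bug: the UTF-8 bytes are decoded with the wrong charset
# (ISO-8859-1 here, as an assumption), producing mojibake, and that
# mojibake is then UTF-8 encoded again when the URL is built.
mojibake = query.encode("utf-8").decode("iso-8859-1")
double_encoded = quote(mojibake)

print(correct)         # J%C3%B6rn
print(double_encoded)  # J%C3%83%C2%B6rn
```

So %C3%83%C2%B6 is valid UTF-8, just for the two characters "Ã¶" rather than for "ö".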

: 
: On Fri, Aug 7, 2020 at 11:53 AM Jörn Franke  wrote:
: 
: > Dear all,
: >
: > I have the following issues. I have a Solrj Client 8.6 (but it happens
: > also in previous versions), where I execute, for example, the following
: > query:
: > Jörn
: >
: > If I look into Solr Admin UI it finds all the right results.
: >
: > If I use Solrj client then it does not find anything.
: > Further, investigating in debug mode it seems that the URI to server gets
: > wrongly encoded.
: > Jörn becomes J%C3%83%C2%B6rn
: > It should become only J%C3%B6rn
: > any idea why this happens and why it add %83%C2 inbetween? Those do not
: > seem to be even valid UTF-8 characters
: >
: > I verified with various statements that I give to Solrj the correct
: > encoded String "Jörn"
: >
: > Can anyone help me here?
: >
: > Thank you.
: >
: > best regards
: >
: 

-Hoss
http://www.lucidworks.com/

RE: Getting rid of Master/Slave nomenclature in Solr

2020-08-07 Thread Jamie Gruener
Marcus,

Thank you for tackling this. I'm not a developer, just a user, so my ability to 
help is limited to moral support. And I support your efforts 100%.

Thank you,

--Jamie

-Original Message-
From: Marcus Eagan  
Sent: Monday, August 3, 2020 12:15 PM
To: solr-user@lucene.apache.org
Subject: Re: Getting rid of Master/Slave nomenclature in Solr

Here is some of the work I did to remedy this effort before I knew about this 
email:

https://github.com/apache/lucene-solr/pull/1712

https://issues.apache.org/jira/browse/SOLR-14702page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel=17169865

It makes me sick to read master/slave and this issue has alienated a buddy I've 
tried to recruit to volunteer on the project. All comments welcome, but please 
read the above docs as I will review this email thread now to understand what 
has already been discussed. I put in a lot of work to get this solid.

Happy to discuss.

Marcus

On 2020/06/17 19:37:20, Anshum Gupta wrote:

> Hi everyone,
>
> Moving a conversation that was happening on the PMC list to the public
> forum. Most of the following is just me recapping the conversation that has
> happened so far.
>
> Some members of the community have been discussing getting rid of the
> master/slave nomenclature from Solr.
>
> While this may require a non-trivial effort, a general consensus so far
> seems to be to start this process and switch over incrementally, if a
> single change ends up being too big.
>
> There have been a lot of suggestions around what the new nomenclature might
> look like, a few people don’t want to overlap the naming here with what
> already exists in SolrCloud i.e. leader/follower.
>
> Primary/Replica was an option that was suggested based on what other
> vendors are moving towards based on Wikipedia:
> https://en.wikipedia.org/wiki/Master/slave_(technology)
> , however there were concerns around the use of “replica” as that denotes a
> very specific concept in SolrCloud. Current terminology clearly
> differentiates the use of the traditional replication model from SolrCloud
> and reusing the names would make it difficult for that to happen.
>
> There were similar concerns around using Leader/follower.
>
> Let’s continue this conversation here while making sure that we converge
> without much bike-shedding.
>
> -Anshum



Re: wt=xml not defaulting the results to xml format

2020-08-07 Thread Alexandre Rafalovitch
You have echoParams set to all. What does that return?

Regards,
   Alex

On Fri., Aug. 7, 2020, 11:31 a.m. yaswanth kumar, 
wrote:

> Thanks for looking into this Erick,
>
>
> solr/PROXIMITY_DATA_V2/select?q=pkey:223_*=true=country_en=country_en
>
> That's the URL I am hitting. I also made sure that initParams is
> all commented out, like this, and that there is no uncommented
> section defined for initParams.
> 
> Also from the solrcloud I did make sure that I am checking the correct
> collection and verified solrconfig.xml by choosing the collection and
> browsing files within the same collection.
>
> Whatever I try, nothing works other than sending wt=xml as a
> parameter while hitting the URL.
>
> Thanks,
>
> On Fri, Aug 7, 2020 at 10:31 AM Erick Erickson 
> wrote:
>
> > Please show us the _exact_ URL you’re sending as well as the response
> > header, particularly the echoed params.
> >
> > This is a long shot, but also take a look at any “initParams” sections in
> > solrconfig.xml. The “wt” parameter you’ve specified in your select
> handler
> > should override anything in the  section of initParams. But
> > your handler is specifying wt in the defaults section; if your initParams
> > have the json wt specified in an invariants section, that would control.
> >
> > I also recommend you look at your solrconfig through the admin UI; that
> > ensures that you’re looking at the same solrconfig that your collection is
> > actually using. Then check your collections/ to
> > double check that your collection is using the configset you think it is.
> > This latter assumes SolrCloud.
> >
> > This is likely something in your configurations that is not as you
> expect.
> >
> > Best,
> > Erick
> >
> > > On Aug 7, 2020, at 10:19 AM, yaswanth kumar 
> > wrote:
> > >
> > > Thanks Shawn, for looking into this.
> > >
> > > I did make sure that no explicit parameter wt is being sent and also
> > > verified the logs and even that's not showing up any extra parameters.
> > But
> > > it's always taking json as a default, unless I pass it explicitly as
> > wt=xml
> > > which I don't want to do it here. Is there something else that I need
> to
> > do
> > > ?
> > >
> > > On Fri, Aug 7, 2020 at 4:23 AM Shawn Heisey 
> wrote:
> > >
> > >> How are you sending the query request that doesn't come back as xml? I
> > >> suspect that the request is being sent with an explicit wt parameter
> > set to
> > >> something other than xml. Making a query with the admin ui would do
> > this,
> > >> and it would probably default to json.
> > >>
> > >> When you make a query, assuming you haven't changed the logging
> config,
> > >> every parameter in that request can be found in the log entry for the
> > >> query, including those that come from the solrconfig.xml.
> > >>
> > >> Sorry about the top posted reply. It's the only option on this email
> > app.
> > >> My computer isn't available so I'm on my phone.
> > >>
> > >> On Aug 6, 2020, 21:52, at 21:52, yaswanth kumar <
> yaswanth...@gmail.com>
> > >> wrote:
> > >>> Can someone help me on this ASAP? I am using solr 8.2.0 and below is
> > >>> the
> > >>> snippet from solrconfig.xml for one of the configset, where I am
> trying
> > >>> to
> > >>> default the results into xml format but its giving me as a json
> result.
> > >>>
> > >>> 
> > >>>   
> > >>>   
> > >>> all
> > >>> 10
> > >>> 
> > >>>pkey
> > >>>xml
> > >>>   
> > >>>
> > >>> Can some one let me know if I need to do something more to always
> get a
> > >>> solr /select query results as XML??
> > >>> --
> > >>> Thanks & Regards,
> > >>> Yaswanth Kumar Konathala.
> > >>> yaswanth...@gmail.com
> > >>
> > >>
> > >
> > > --
> > > Thanks & Regards,
> > > Yaswanth Kumar Konathala.
> > > yaswanth...@gmail.com
> >
> >
>
> --
> Thanks & Regards,
> Yaswanth Kumar Konathala.
> yaswanth...@gmail.com
>


Re: Add custom comparator for field(s) for Sorting.

2020-08-07 Thread Erick Erickson
You can sort by function, see: 
https://lucene.apache.org/solr/guide/6_6/function-queries.html. Not quite sure 
what the function would look like though.

What I would do rather than a custom sort or sort by function is normalize 
these into a different field at index time and just sort on that. You might be 
able to do that with a copyField and clever tokenization, or you could populate 
the field during the ETL process.

The normalized field would have to be clever enough to prefix pure numerics 
with a bunch of zeros though, something like z1 otherwise 
you’d be sorting the numerics lexically, and 100 would come before 2.

A third alternative is to populate two separate fields, one numeric and one 
string then specify primary and secondary fields rather than one field, 
something like
 string_field asc, numeric_field asc
You’d have to pay attention to sortMissingFirst/Last on the string field, you’d 
probably want sortMissingLast.
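
The normalized-field idea above (prefix pure numerics and zero-pad them) can be sketched as follows. This is only an illustration of how such a value could be computed during ETL, not a Solr feature; the function name normalized_sort_key and the padding width of 10 are assumptions. It also assumes no non-numeric name sorts after the "z" prefix, so names such as "zebra" would need a different prefix.

```python
import re


def normalized_sort_key(name: str, width: int = 10) -> str:
    """Return a value whose plain string order puts pure numbers last, in numeric order."""
    if re.fullmatch(r"[0-9]+", name):
        # "z" pushes numerics after names like "S1"; zero-padding makes
        # lexical order agree with numeric order (2 before 100).
        return "z" + name.zfill(width)
    return name


docs = [
    {"id": "doc1", "name": "1"},
    {"id": "doc2", "name": "S1"},
    {"id": "doc3", "name": "S2"},
]
ordered = [d["id"] for d in sorted(docs, key=lambda d: normalized_sort_key(d["name"]))]
print(ordered)  # ['doc2', 'doc3', 'doc1']
```

The computed value would be indexed into a separate string field and the sort done on that field.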

Best,
Erick


> On Aug 7, 2020, at 12:47 PM, Pushkar Raste  wrote:
> 
> Hi,
> Is it possible to add a custom comparator to a field for sorting. e.g.
> let's say I have field 'name' and following documents
> 
> {
>  id : "doc1",
>  name : "1"
> }
> 
> 
> {
>  id : "doc2",
>  name : "S1"
> }
> 
> 
> {
>  id : "doc3",
>  name : "S2"
> }
> 
> if I sort using field 'name', the order would be : ["doc1", "doc2", "doc3"]
> but I want pure numbers to last and want the order ["doc2", "doc3",
> "doc1"]. Is there a way I can provide my own comparator?



Add custom comparator for field(s) for Sorting.

2020-08-07 Thread Pushkar Raste
Hi,
Is it possible to add a custom comparator to a field for sorting. e.g.
let's say I have field 'name' and following documents

{
  id : "doc1",
  name : "1"
}


{
  id : "doc2",
  name : "S1"
}


{
  id : "doc3",
  name : "S2"
}

if I sort using field 'name', the order would be : ["doc1", "doc2", "doc3"]
but I want pure numbers to last and want the order ["doc2", "doc3",
"doc1"]. Is there a way I can provide my own comparator?


Re: Solr + Parquets

2020-08-07 Thread Jörn Franke
DIH is deprecated and will be removed from Solr. You may still be able to 
install it as a plug-in, but AFAIK nobody maintains it. Do not use it anymore.

You can write a custom Spark data source that writes to Solr, or do the writes 
in a Spark map step using SolrJ.
In both cases, do not create 100s of executors, to avoid overloading Solr.


> Am 07.08.2020 um 18:39 schrieb Kevin Van Lieshout :
> 
> Hi,
> 
> Is there any assistance around writing parquets from spark to solr shards
> or is it possible to customize a DIH to import a parquet to a solr shard.
> Let me know if this is possible, or the best work around for this. Much
> appreciated, thanks
> 
> 
> Kevin VL


Solr + Parquets

2020-08-07 Thread Kevin Van Lieshout
Hi,

Is there any assistance around writing parquets from spark to solr shards
or is it possible to customize a DIH to import a parquet to a solr shard.
Let me know if this is possible, or the best work around for this. Much
appreciated, thanks


Kevin VL


Re: copyField from empty multivalue

2020-08-07 Thread matthew sporleder
Never mind, I think we found that this was caused by a bug in our (new) custom indexer.

On Thu, Aug 6, 2020 at 4:11 PM matthew sporleder  wrote:
>
> I have a copyField:
>  
>  
>
> But sometimes preview ( indexed="true" stored="true" multiValued="true" />) is not populated.
>
> It appears that the "catchall" field does not get created when preview
> has no content in it.  Can I use required=false or similar on a
> copyField?
>
> Thanks,
> Matt


Re: wt=xml not defaulting the results to xml format

2020-08-07 Thread yaswanth kumar
Thanks for looking into this Erick,

solr/PROXIMITY_DATA_V2/select?q=pkey:223_*=true=country_en=country_en

That's the URL I am hitting. I also made sure that initParams is
all commented out, like this, and that there is no uncommented
section defined for initParams.

Also from the solrcloud I did make sure that I am checking the correct
collection and verified solrconfig.xml by choosing the collection and
browsing files within the same collection.

Whatever I try, nothing works other than sending wt=xml as a
parameter while hitting the URL.

Thanks,

On Fri, Aug 7, 2020 at 10:31 AM Erick Erickson 
wrote:

> Please show us the _exact_ URL you’re sending as well as the response
> header, particularly the echoed params.
>
> This is a long shot, but also take a look at any “initParams” sections in
> solrconfig.xml. The “wt” parameter you’ve specified in your select handler
> should override anything in the  section of initParams. But
> your handler is specifying wt in the defaults section; if your initParams
> have the json wt specified in an invariants section, that would control.
>
> I also recommend you look at your solrconfig through the admin UI; that
> ensures that you’re looking at the same solrconfig that your collection is
> actually using. Then check your collections/ to
> double check that your collection is using the configset you think it is.
> This latter assumes SolrCloud.
>
> This is likely something in your configurations that is not as you expect.
>
> Best,
> Erick
>
> > On Aug 7, 2020, at 10:19 AM, yaswanth kumar 
> wrote:
> >
> > Thanks Shawn, for looking into this.
> >
> > I did make sure that no explicit parameter wt is being sent and also
> > verified the logs and even that's not showing up any extra parameters.
> But
> > it's always taking json as a default, unless I pass it explicitly as
> wt=xml
> > which I don't want to do it here. Is there something else that I need to
> do
> > ?
> >
> > On Fri, Aug 7, 2020 at 4:23 AM Shawn Heisey  wrote:
> >
> >> How are you sending the query request that doesn't come back as xml? I
> >> suspect that the request is being sent with an explicit wt parameter
> set to
> >> something other than xml. Making a query with the admin ui would do
> this,
> >> and it would probably default to json.
> >>
> >> When you make a query, assuming you haven't changed the logging config,
> >> every parameter in that request can be found in the log entry for the
> >> query, including those that come from the solrconfig.xml.
> >>
> >> Sorry about the top posted reply. It's the only option on this email
> app.
> >> My computer isn't available so I'm on my phone.
> >>
> >> On Aug 6, 2020, 21:52, at 21:52, yaswanth kumar 
> >> wrote:
> >>> Can someone help me on this ASAP? I am using solr 8.2.0 and below is
> >>> the
> >>> snippet from solrconfig.xml for one of the configset, where I am trying
> >>> to
> >>> default the results into xml format but its giving me as a json result.
> >>>
> >>> 
> >>>   
> >>>   
> >>> all
> >>> 10
> >>> 
> >>>pkey
> >>>xml
> >>>   
> >>>
> >>> Can some one let me know if I need to do something more to always get a
> >>> solr /select query results as XML??
> >>> --
> >>> Thanks & Regards,
> >>> Yaswanth Kumar Konathala.
> >>> yaswanth...@gmail.com
> >>
> >>
> >
> > --
> > Thanks & Regards,
> > Yaswanth Kumar Konathala.
> > yaswanth...@gmail.com
>
>

-- 
Thanks & Regards,
Yaswanth Kumar Konathala.
yaswanth...@gmail.com


Re: wt=xml not defaulting the results to xml format

2020-08-07 Thread Erick Erickson
Please show us the _exact_ URL you’re sending as well as the response header, 
particularly the echoed params.

This is a long shot, but also take a look at any “initParams” sections in 
solrconfig.xml. The “wt” parameter you’ve specified in your select handler 
should override anything in the  section of initParams. But your 
handler is specifying wt in the defaults section; if your initParams have the 
json wt specified in an invariants section, that would control.
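
To make the precedence described above concrete, here is a simplified model of how the effective value of a parameter such as wt is chosen. It is a sketch of the ordering only, not Solr's actual implementation, and it ignores "appends" sections; the function name effective_params is hypothetical.

```python
def effective_params(request, handler_defaults=None, init_defaults=None, invariants=None):
    """Merge parameter sources from weakest to strongest; later updates win."""
    merged = {}
    merged.update(init_defaults or {})     # defaults from an initParams section
    merged.update(handler_defaults or {})  # defaults on the request handler itself
    merged.update(request or {})           # parameters sent with the request
    merged.update(invariants or {})        # invariants cannot be overridden
    return merged


# Handler default wt=xml beats an initParams default of json.
print(effective_params({}, {"wt": "xml"}, {"wt": "json"})["wt"])
# An invariant wt=json wins even when the request asks for xml.
print(effective_params({"wt": "xml"}, {"wt": "xml"}, None, {"wt": "json"})["wt"])
```

If the response still comes back as JSON with no wt in the request, the echoed params in the response header show which of these sources supplied the value.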

I also recommend you look at your solrconfig through the admin UI; that ensures 
that you’re looking at the same solrconfig that your collection is actually 
using. Then check your collections/ to double check 
that your collection is using the configset you think it is. This latter 
assumes SolrCloud.

This is likely something in your configurations that is not as you expect.

Best,
Erick

> On Aug 7, 2020, at 10:19 AM, yaswanth kumar  wrote:
> 
> Thanks Shawn, for looking into this.
> 
> I did make sure that no explicit parameter wt is being sent and also
> verified the logs and even that's not showing up any extra parameters. But
> it's always taking json as a default, unless I pass it explicitly as wt=xml
> which I don't want to do it here. Is there something else that I need to do
> ?
> 
> On Fri, Aug 7, 2020 at 4:23 AM Shawn Heisey  wrote:
> 
>> How are you sending the query request that doesn't come back as xml? I
>> suspect that the request is being sent with an explicit wt parameter set to
>> something other than xml. Making a query with the admin ui would do this,
>> and it would probably default to json.
>> 
>> When you make a query, assuming you haven't changed the logging config,
>> every parameter in that request can be found in the log entry for the
>> query, including those that come from the solrconfig.xml.
>> 
>> Sorry about the top posted reply. It's the only option on this email app.
>> My computer isn't available so I'm on my phone.
>> 
>> On Aug 6, 2020, 21:52, at 21:52, yaswanth kumar 
>> wrote:
>>> Can someone help me on this ASAP? I am using solr 8.2.0 and below is
>>> the
>>> snippet from solrconfig.xml for one of the configset, where I am trying
>>> to
>>> default the results into xml format but its giving me as a json result.
>>> 
>>> 
>>>   
>>>   
>>> all
>>> 10
>>> 
>>>pkey
>>>xml
>>>   
>>> 
>>> Can some one let me know if I need to do something more to always get a
>>> solr /select query results as XML??
>>> --
>>> Thanks & Regards,
>>> Yaswanth Kumar Konathala.
>>> yaswanth...@gmail.com
>> 
>> 
> 
> -- 
> Thanks & Regards,
> Yaswanth Kumar Konathala.
> yaswanth...@gmail.com



Re: Replication of Solr Model and feature store

2020-08-07 Thread krishan goyal
Hi Monica,

Replication is working fine for me. You just have to add the
_schema_feature-store.json and _schema_model-store.json to confFiles under
/replication in solrconfig.xml

I think the issue you are seeing is where the model is referencing a
feature which is not present in the feature store. Or the feature weights
for the model are incorrect. The issue in solr is that it doesn't return
you the right exception but throws a model not found exception

Try these ways to fix it
1. verify feature weights are < 1. I am not sure why having weights > 1 is
an issue but apparently it is in some random cases
2. verify all features used in the model file _schema_model-store.json are
actually present in the feature file _schema_feature-store.json.
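
Both checks can be scripted. The sketch below assumes the model and the features are available as Python dicts shaped like the JSON that is uploaded to the LTR endpoints (a "features" list and a "params.weights" map on the model, a "name" on each feature). How those definitions are wrapped inside the _schema_*.json files is not covered here, so extract the lists from your own files first. The feature names and the helper names are hypothetical.

```python
def missing_features(model: dict, feature_store: list) -> list:
    """Names the model refers to (in features or weights) that the store does not define."""
    known = {feature["name"] for feature in feature_store}
    referenced = {feature["name"] for feature in model.get("features", [])}
    referenced |= set(model.get("params", {}).get("weights", {}))
    return sorted(referenced - known)


def weights_above_one(model: dict) -> dict:
    """Weights greater than 1, which check 1 above suggests reviewing."""
    weights = model.get("params", {}).get("weights", {})
    return {name: weight for name, weight in weights.items() if weight > 1}


# Hypothetical sample data for illustration only.
feature_store = [
    {"name": "popularity",
     "class": "org.apache.solr.ltr.feature.FieldValueFeature",
     "params": {"field": "popularity"}},
]
model = {
    "name": "testModel",
    "class": "org.apache.solr.ltr.model.LinearModel",
    "features": [{"name": "popularity"}, {"name": "titleLength"}],
    "params": {"weights": {"popularity": 0.4, "titleLength": 1.5}},
}

print(missing_features(model, feature_store))  # ['titleLength']
print(weights_above_one(model))                # {'titleLength': 1.5}
```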

Another issue with Solr LTR is that if you have a corrupt model/feature file,
you can't update/delete it via the API in some cases. You would need to
change the respective _schema_model-store.json
and _schema_feature-store.json files and reload the cores for the changes
to take effect.

Please try these and let me know if the issue still exists

On Thu, Aug 6, 2020 at 11:18 PM Monica Skidmore <
monica.skidm...@careerbuilder.com> wrote:

> I would be interested in the answer here, as well.  We're using LTR
> successfully on Solr 7.3 and Solr 8.3 in cloud mode, but we're struggling
> to load a simple, test model on 8.3 in master/slave mode.   The
> FeatureStore appears to load, but we're not sure it's loading correctly,
> either. Here are some details from the engineer on our team who is leading
> that effort:
>
> "I'm getting a ClassCastException when uploading a Model. Using the
> debugger, was able to see the line throwing the exception is:
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:488)
>
> Apparently it cannot find: org.apache.solr.ltr.model.LinearModel, although
> the features appear to be created without issues with the following class:
> org.apache.solr.ltr.feature.FieldValueFeature
>
> Another thing we were able to see is that the List features has a
> list of null elements, so that made us think there may be some issues when
> creating the instances of Feature.
>
> We had begun to believe this might be related to the fact that we are
> running Solr in Master/Slave config. Was LTR ever tested on non-cloud
> deployments??
>
> Any help is appreciated."
>
> Monica D Skidmore
> Lead Engineer, Core Search
>
>
>
> CareerBuilder.com  | Blog <
> https://www.careerbuilder.com/advice> | Press Room <
> https://press.careerbuilder.com/>
>
>
>
>
> On 7/24/20, 7:58 AM, "Christine Poerschke (BLOOMBERG/ LONDON)" <
> cpoersc...@bloomberg.net> wrote:
>
> Hi Krishan,
>
> Could you share what version of Solr you are using?
>
> And I wonder if the observed behaviour could be reproduced e.g. with
> the techproducts example, changes not applying after reload [1] sounds like
> a bug if so.
>
> Hope that helps.
>
> Regards,
>
> Christine
>
> [1]
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flucene.apache.org%2Fsolr%2Fguide%2F8_6%2Flearning-to-rank.html%23applying-changesdata=01%7C01%7CMonica.Skidmore%40careerbuilder.com%7C65581e5e79414c90832508d82fc8ce21%7C7cc1677566a34e8b80fd5b1f1db15061%7C0sdata=mMqgPhnkjb8h7ETQNaySOBJQ8x%2FP2dtzM%2FgSE1K1FZg%3Dreserved=0
>
> From: solr-user@lucene.apache.org At: 07/22/20 14:00:59To:
> solr-user@lucene.apache.org
> Subject: Re: Replication of Solr Model and feature store
>
> Adding more details here
>
> I need some help on how to enable the Solr LTR model and features on all
> nodes of a Solr cluster.
>
> I am unable to replicate the model and the feature store from any master
> to its slaves with the replication API, and I cannot find any
> documentation on this. Is replication possible?
>
> Without replication, would I have to update every node of the cluster
> individually? Or can the feature and model files be read as a resource
> (like config or schema) so that I can replicate the file or add it to my
> deployments?
>
>
> On Wed, Jul 22, 2020 at 5:53 PM krishan goyal 
> wrote:
>
> > Bump. Does anyone have an idea how to proceed here?
> >
> > On Wed, Jul 8, 2020 at 5:41 PM krishan goyal 
> > wrote:
> >
> >> Hi,
> >>
> >> How do I enable replication of the model and feature store?
> >>
> >> Thanks
> >> Krishan
> >>
> >
>
>
>
>
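
On the question quoted above about updating every node individually: a
minimal sketch of doing that over HTTP, assuming the LTR stores are reachable
at <core>/schema/feature-store and <core>/schema/model-store as in the
reference guide. The host names, the core name "mycore" and the two JSON file
names are hypothetical, and this is not a confirmed answer to whether the
replication handler can carry the stores.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

class LtrStoreFanOut {
    // Builds one store URI per core; 'store' is "feature-store" or "model-store".
    static List<URI> storeUris(List<String> coreBaseUrls, String store) {
        return coreBaseUrls.stream()
                .map(base -> URI.create(base + "/schema/" + store))
                .collect(Collectors.toList());
    }

    // Sends the JSON body with PUT and prints the HTTP status for that core.
    static void putJson(HttpClient client, URI uri, String json) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(uri + " -> HTTP " + response.statusCode());
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical master and slave cores; replace with the real ones.
        List<String> cores = List.of(
                "http://master:8983/solr/mycore",
                "http://slave1:8983/solr/mycore");
        // Hypothetical file names for the feature and model definitions.
        String features = Files.readString(Path.of("myFeatures.json"));
        String model = Files.readString(Path.of("myModel.json"));

        HttpClient client = HttpClient.newHttpClient();
        // Features go first, because a model refers to features by name.
        for (URI uri : storeUris(cores, "feature-store")) {
            putJson(client, uri, features);
        }
        for (URI uri : storeUris(cores, "model-store")) {
            putJson(client, uri, model);
        }
    }
}
```

Per the "applying changes" section linked as [1] above, changes to the stores
take effect only after the core is reloaded, so each core would still need a
reload afterwards.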


Re: wt=xml not defaulting the results to xml format

2020-08-07 Thread yaswanth kumar
Thanks Shawn, for looking into this.

I made sure that no explicit wt parameter is being sent, and I checked the
logs, which show no extra parameters either. But it always defaults to json
unless I pass wt=xml explicitly, which I don't want to do here. Is there
something else that I need to do?

On Fri, Aug 7, 2020 at 4:23 AM Shawn Heisey  wrote:

> How are you sending the query request that doesn't come back as xml? I
> suspect that the request is being sent with an explicit wt parameter set to
> something other than xml. Making a query with the admin ui would do this,
> and it would probably default to json.
>
> When you make a query, assuming you haven't changed the logging config,
> every parameter in that request can be found in the log entry for the
> query, including those that come from the solrconfig.xml.
>
> Sorry about the top posted reply. It's the only option on this email app.
> My computer isn't available so I'm on my phone.
>
> On Aug 6, 2020, 21:52, at 21:52, yaswanth kumar 
> wrote:
> >Can someone help me on this ASAP? I am using Solr 8.2.0, and below is the
> >snippet from solrconfig.xml for one of the configsets, where I am trying to
> >default the results to XML format, but it's giving me a JSON result.
> >
> >
> >
> >
> >  all
> >  10
> >  
> > pkey
> > xml
> >
> >
> >Can someone let me know if I need to do something more to always get
> >Solr /select query results as XML?
> >--
> >Thanks & Regards,
> >Yaswanth Kumar Konathala.
> >yaswanth...@gmail.com
>
>

-- 
Thanks & Regards,
Yaswanth Kumar Konathala.
yaswanth...@gmail.com


Re: Cybersecurity Incident Report

2020-08-07 Thread Jan Høydahl
If you suspect a new vulnerability in the product, please report it as
detailed on our security page:
https://lucene.apache.org/solr/security.html

For the existing ones, first check whether the upgrades have already been done
in 8.5 or 8.6. If not, check whether there is an open JIRA issue about
upgrading these dependencies, and if there is none, kindly open a new JIRA
issue. And if you are willing to contribute, a patch or PR is highly welcome
too :)

Jan

> On 7 Aug 2020, at 05:03, Man with No Name wrote:
> 
> You’re absolutely right. Some of these are shadow jars and some are used
> directly. Take netty: we're securing the communication using TLS, so the
> netty CVE applies.
> 
> So going back to the initial question, what would be the best way to
> report this, so that it can be looked at?
> 
> On Fri, Jul 24, 2020 at 7:35 PM Shawn Heisey  wrote:
> 
>> On 7/24/2020 2:35 PM, Man with No Name wrote:
>>> This version of jackson is pulled in as a shadow jar. Also solr is using
>>> io.netty version 4.1.29.Final which has critical vulnerabilities which
>>> are fixed in 4.1.44.
>> 
>> It looks like that shaded jackson library is included in the jar for
>> htrace.  I looked through the commit history and learned that htrace is
>> included for the HDFS support in Solr.  Which means that if you are not
>> using the HDFS capability, then htrace will not be used, so the older
>> jackson library will not be used either.
>> 
>> If you are not using TLS connections from SolrCloud to ZooKeeper, then
>> your install of Solr will not be using the netty library, and
>> vulnerabilities in netty will not apply.
>> 
>> The older version of Guava is pulled in with a jar from carrot2.  If
>> your Solr install does not use carrot2 clustering, then that version of
>> Guava will never be called.
>> 
>> The commons-compress and tika libraries are only used if you have
>> configured the extraction contrib, also known as SolrCell.  This contrib
>> module is used to index rich-text documents, such as PDF and Word.
>> Because it makes Solr unstable, we strongly recommend that nobody should
>> use SolrCell in production.  When rich-text documents need to be
>> indexed, it should be accomplished by using Tika outside of Solr... and
>> if that recommendation is followed, you can control the version used so
>> that the well-known vulnerabilities will not be present.
>> 
>> We have always recommended that Solr should be located in a network
>> place that can only be reached by systems and people who are authorized.
>>  If that is done, then nobody will be able to exploit any
>> vulnerabilities that might exist in Solr unless they first successfully
>> break into an authorized system.
>> 
>> We do take these reports of vulnerabilities seriously and close them as
>> quickly as we can.
>> 
>> Thanks,
>> Shawn
>> 



Re: Solrj client 8.6.0 issue special characters in query

2020-08-07 Thread Andy Webb
hi Jörn - something's decoding a UTF8 sequence using the legacy iso-8859-1
character set:

Jörn is J%C3%B6rn in UTF8
J%C3%B6rn misinterpreted as iso-8859-1 is JÃ¶rn
JÃ¶rn is J%C3%83%C2%B6rn in UTF8
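
The three steps above can be reproduced outside SolrJ. This is only a sketch
of the mechanism, not of where it happens in the application; the class and
method names are made up for illustration, and the name is written with a
\u00f6 escape so the source file's own encoding cannot interfere.
URLEncoder.encode(String, Charset) needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DoubleEncodingDemo {
    // Takes the UTF-8 bytes of s and reads them back as ISO-8859-1,
    // which turns every multi-byte character into several Latin-1 characters.
    static String misdecodeAsLatin1(String s) {
        byte[] utf8Bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Percent-encodes s using UTF-8, as a URL query parameter would be.
    static String percentEncode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "J\u00f6rn";

        // Correct path: one UTF-8 encoding step.
        System.out.println(percentEncode(name));

        // Broken path: the bytes are decoded with the wrong charset,
        // then encoded as UTF-8 a second time.
        System.out.println(percentEncode(misdecodeAsLatin1(name)));
    }
}
```

The second line printed contains the extra %83%C2 from the report: C3 83 and
C2 B6 are the UTF-8 encodings of the two Latin-1 characters that the original
bytes C3 and B6 were mistaken for.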

I hope this helps track down the problem!
Andy

On Fri, 7 Aug 2020 at 12:08, Jörn Franke  wrote:

> Hmm, setting -Dfile.encoding=UTF-8 solves the problem. I have to now check
> which component of the application screws it up, but at the moment I do NOT
> believe it is related to Solrj.
>
> On Fri, Aug 7, 2020 at 11:53 AM Jörn Franke  wrote:
>
> > Dear all,
> >
> > I have the following issue. I have a SolrJ client 8.6 (it also happens
> > in previous versions), where I execute, for example, the following
> > query:
> > Jörn
> >
> > If I look into Solr Admin UI it finds all the right results.
> >
> > If I use the SolrJ client, it does not find anything.
> > Investigating further in debug mode, it seems that the URI sent to the
> > server gets wrongly encoded.
> > Jörn becomes J%C3%83%C2%B6rn
> > It should become only J%C3%B6rn
> > Any idea why this happens and why it adds %83%C2 in between? Those do not
> > even seem to be valid UTF-8 characters.
> >
> > I verified with various statements that I give SolrJ the correctly
> > encoded String "Jörn".
> >
> > Can anyone help me here?
> >
> > Thank you.
> >
> > best regards
> >
>


Re: Solrj client 8.6.0 issue special characters in query

2020-08-07 Thread Jörn Franke
Hmm, setting -Dfile.encoding=UTF-8 solves the problem. I have to now check
which component of the application screws it up, but at the moment I do NOT
believe it is related to Solrj.
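
For tracking down the component: a small sketch that prints the charset the
JVM is actually using and contrasts a decode that depends on that default
with one that names UTF-8 explicitly. The class and method names are
hypothetical. As far as I know the default follows file.encoding on Java
versions before 18 and is UTF-8 from 18 on, so treat that part as an
assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetPinning {
    // Decodes with the JVM default charset; the result depends on how
    // the JVM was started (for example on -Dfile.encoding).
    static String decodeWithDefault(byte[] bytes) {
        return new String(bytes);
    }

    // Decodes as UTF-8 regardless of the JVM default.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "J\u00f6rn";
        byte[] utf8Bytes = name.getBytes(StandardCharsets.UTF_8);

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));

        // Booleans are printed instead of the text itself, because the
        // console encoding could garble the non-ASCII character again.
        System.out.println("default decode intact:  "
                + decodeWithDefault(utf8Bytes).equals(name));
        System.out.println("explicit decode intact: "
                + decodeAsUtf8(utf8Bytes).equals(name));
    }
}
```

If the first boolean is false without -Dfile.encoding=UTF-8, any code in the
application that converts between bytes and Strings without naming a charset
is a candidate.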

On Fri, Aug 7, 2020 at 11:53 AM Jörn Franke  wrote:

> Dear all,
>
> I have the following issue. I have a SolrJ client 8.6 (it also happens
> in previous versions), where I execute, for example, the following
> query:
> Jörn
>
> If I look into Solr Admin UI it finds all the right results.
>
> If I use the SolrJ client, it does not find anything.
> Investigating further in debug mode, it seems that the URI sent to the
> server gets wrongly encoded.
> Jörn becomes J%C3%83%C2%B6rn
> It should become only J%C3%B6rn
> Any idea why this happens and why it adds %83%C2 in between? Those do not
> even seem to be valid UTF-8 characters.
>
> I verified with various statements that I give SolrJ the correctly
> encoded String "Jörn".
>
> Can anyone help me here?
>
> Thank you.
>
> best regards
>


Solrj client 8.6.0 issue special characters in query

2020-08-07 Thread Jörn Franke
Dear all,

I have the following issue. I have a SolrJ client 8.6 (it also happens in
previous versions), where I execute, for example, the following query:
Jörn

If I look into Solr Admin UI it finds all the right results.

If I use the SolrJ client, it does not find anything.
Investigating further in debug mode, it seems that the URI sent to the server
gets wrongly encoded.
Jörn becomes J%C3%83%C2%B6rn
It should become only J%C3%B6rn
Any idea why this happens and why it adds %83%C2 in between? Those do not
even seem to be valid UTF-8 characters.

I verified with various statements that I give SolrJ the correctly encoded
String "Jörn".

Can anyone help me here?

Thank you.

best regards


Re: wt=xml not defaulting the results to xml format

2020-08-07 Thread Shawn Heisey
How are you sending the query request that doesn't come back as xml? I suspect 
that the request is being sent with an explicit wt parameter set to something 
other than xml. Making a query with the admin ui would do this, and it would 
probably default to json.

When you make a query, assuming you haven't changed the logging config, every 
parameter in that request can be found in the log entry for the query, 
including those that come from the solrconfig.xml.
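
One way to rule the client out is to send a request that carries nothing but
q and look at the Content-Type that comes back. This is a rough sketch using
the JDK HTTP client (Java 11 or later); the base URL and the collection name
"mycollection" are hypothetical placeholders.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class BareSelectProbe {
    // Builds a /select URI that carries only q, so any wt in effect
    // has to come from the handler configuration on the server.
    static URI buildSelectUri(String baseUrl, String collection, String query) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + collection + "/select?q=" + q);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical host and collection; replace with the real ones.
        URI uri = buildSelectUri("http://localhost:8983/solr", "mycollection", "*:*");

        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println("request:      " + uri);
        System.out.println("Content-Type: "
                + response.headers().firstValue("Content-Type").orElse("(none)"));
    }
}
```

The log entry for this request should then list every parameter Solr
actually applied, including any wt picked up from solrconfig.xml.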

Sorry about the top posted reply. It's the only option on this email app. My 
computer isn't available so I'm on my phone.


On Aug 6, 2020, 21:52, at 21:52, yaswanth kumar  wrote:
>Can someone help me on this ASAP? I am using Solr 8.2.0, and below is the
>snippet from solrconfig.xml for one of the configsets, where I am trying to
>default the results to XML format, but it's giving me a JSON result.
>
>
>
>
>  all
>  10
>  
> pkey
> xml
>
>
>Can someone let me know if I need to do something more to always get
>Solr /select query results as XML?
>--
>Thanks & Regards,
>Yaswanth Kumar Konathala.
>yaswanth...@gmail.com