Re: SolrJ: SolrInputDocument.addField()

2021-02-14 Thread Shawn Heisey

On 2/14/2021 9:00 AM, Steven White wrote:

> It looks like I'm misusing the SolrJ API SolrInputDocument.addField(), so I
> need clarification.
>
> Here is an example of what I have in my code:
>
>  SolrInputDocument doc = new SolrInputDocument();
>  doc.addField("MyFieldOne", "some data");
>  doc.addField("MyFieldTwo", 100);
>
> The above code is creating 2 fields for me (if they don't exist already)
> and then indexing the data to those fields.  The data is "some data" and
> the number 100.  However, when the field is created, it is not using the
> custom field type that I created in my schema.  My question is, how do I
> tell addField() to use my custom field type?


There is no way in SolrJ code to control which fieldType is used.  That 
is controlled solely by the server-side schema definition.


How do you know that Solr is not using the correct fieldType?  If you 
are looking at the documents returned by a search and aren't seeing the 
transformations described in the schema, you're looking in the wrong place.


Solr search results always return what was originally sent in for
indexing.  Only Update Processors (defined in solrconfig.xml, not the
schema) can affect what gets returned in results; fieldType definitions
NEVER affect data returned in search results.
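
To make the distinction above concrete, here is a toy model in plain Java. It is not Solr or SolrJ code: the class ToyIndex and its methods are hypothetical names invented for this illustration, and the "analysis" (lowercase, split on whitespace) merely stands in for whatever a fieldType might define. The point it sketches is that analysis shapes the terms used for matching, while a search hands back the stored value exactly as it was sent in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: hypothetical names, not Solr's implementation.
final class ToyIndex {
    // Stored values: kept exactly as they were sent in.
    private final Map<String, String> stored = new HashMap<>();
    // Inverted index: analyzed term -> ids of the documents containing it.
    private final Map<String, Set<String>> terms = new HashMap<>();

    // Stand-in for a fieldType's analysis chain: lowercase, split on whitespace.
    static List<String> analyze(String value) {
        List<String> out = new ArrayList<>();
        for (String token : value.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    void add(String id, String value) {
        stored.put(id, value);                 // original input, untouched
        for (String term : analyze(value)) {   // analysis affects matching only
            terms.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(id);
        }
    }

    // Returns the stored (original) values of the documents matching one term.
    List<String> search(String queryTerm) {
        String key = queryTerm.toLowerCase(Locale.ROOT);
        List<String> results = new ArrayList<>();
        for (String id : terms.getOrDefault(key, Collections.<String>emptySet())) {
            results.add(stored.get(id));
        }
        return results;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("1", "Some DATA");
        // Matches through the lowercased term, yet prints [Some DATA].
        System.out.println(index.search("data"));
    }
}
```

In this sketch the only way to change what a search returns is to change the value before add() stores it, which is the role the message above assigns to Update Processors.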


Thanks,
Shawn


Re: Down Replica is elected as Leader (solr v8.7.0)

2021-02-14 Thread mmb1234


We found that for the shard that does not get a leader, the tlog replay did
not complete for hours (we don't see the "log replay finished", "creating
leader registration node", "I am the new leader", etc. log messages).

We are also not sure why the tlogs are tens of GB in size (anywhere from 30
to 40 GB).

The collection's shards have 3 replicas each, TLOG replication and a
10-second hard commit.

The situation results in an outage twice a day. The current workaround is to
stop ingestion, issue a collection rebalance and/or node restarts, and repeat
until the shards are online (a couple of hours of manual recovery each day).

Any suggestions or workarounds?

Not sure if we're running into these issues:
https://issues.apache.org/jira/browse/SOLR-13486
https://issues.apache.org/jira/browse/SOLR-14679


Partial log output from both nodes (sorted by time asc):

myapp-data-solr-0
2021-02-12 19:36:05.965 INFO (zkCallback-14-thread-51) [c:mydata
s:0_8000-9fff r:core_node3 x:mydata_0_8000-9fff_replica_t1]
o.a.s.c.ShardLeaderElectionContext Replaying tlog before become new leader


myapp-data-solr-0 
2021-02-12 19:36:05.966 WARN 
(recoveryExecutor-96-thread-1-processing-n:myapp-data-solr-0.myapp-data-solr-headless:8983_solr
x:mydata_0_8000-9fff_replica_t1 c:mydata s:0_8000-9fff
r:core_node3) [c:mydata s:0_8000-9fff r:core_node3
x:mydata_0_8000-9fff_replica_t1] o.a.s.u.UpdateLog Starting log
replay
tlog{file=/opt/solr/volumes/data1/mydata_0_8000-9fff/tlog/tlog.0003525
refcount=2}  active=false starting pos=0 inSortedOrder=true


myapp-data-solr-0 
2021-02-12 22:13:03.084 INFO 
(recoveryExecutor-96-thread-1-processing-n:myapp-data-solr-0.myapp-data-solr-headless:8983_solr
x:mydata_0_8000-9fff_replica_t1 c:mydata s:0_8000-9fff
r:core_node3) [c:mydata s:0_8000-9fff r:core_node3
x:mydata_0_8000-9fff_replica_t1] o.a.s.u.UpdateLog log replay status
tlog{file=/opt/solr/volumes/data1/mydata_0_8000-9fff/tlog/tlog.0003525
refcount=3} active=false starting pos=0 current pos=27101174167 current
size=33357447222 % read=81.0


myapp-data-solr-0
2021-02-12 22:13:06.602 ERROR (indexFetcher-4092-thread-1) [ ]
o.a.s.h.ReplicationHandler Index fetch failed
:org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: mydata slice: 0_8000-9fff saw
state=DocCollection(mydata//collections/mydata/state.json/750)={
"pullReplicas":"0", "replicationFactor":"0", "shards":{
"0_8000-9fff":{ "range":null, "state":"active", "replicas":{
"core_node3":{ "core":"mydata_0_8000-9fff_replica_t1",
"base_url":"http://myapp-data-solr-0.myapp-data-solr-headless:8983/solr",
"node_name":"myapp-data-solr-0.myapp-data-solr-headless:8983_solr",
"state":"active", "type":"TLOG", "force_set_state":"false"}, "core_node5":{
"core":"mydata_0_8000-9fff_replica_t2",
"base_url":"http://myapp-data-solr-1.myapp-data-solr-headless:8983/solr",
"node_name":"myapp-data-solr-1.myapp-data-solr-headless:8983_solr",
"state":"active", "type":"TLOG", "force_set_state":"false",
"property.preferredleader":"true"}, "core_node6":{
"core":"mydata_0_8000-9fff_replica_t4",
"base_url":"http://myapp-data-solr-2.myapp-data-solr-headless:8983/solr",
"node_name":"myapp-data-solr-2.myapp-data-solr-headless:8983_solr",
"state":"down", "type":"TLOG", "force_set_state":"false"}}},


myapp-data-solr-0
2021-02-12 22:45:51.600 ERROR (indexFetcher-4092-thread-1) [ ]
o.a.s.h.ReplicationHandler Index fetch failed
:org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: mydata slice: 0_8000-9fff saw
state=DocCollection(mydata//collections/mydata/state.json/754)={
"pullReplicas":"0", "replicationFactor":"0", "shards":{
"0_8000-9fff":{ "range":null, "state":"active", "replicas":{
"core_node3":{ "core":"mydata_0_8000-9fff_replica_t1",
"base_url":"http://myapp-data-solr-0.myapp-data-solr-headless:8983/solr",
"node_name":"myapp-data-solr-0.myapp-data-solr-headless:8983_solr",
"state":"active", "type":"TLOG", "force_set_state":"false"}, "core_node5":{
"core":"mydata_0_8000-9fff_replica_t2",
"base_url":"http://myapp-data-solr-1.myapp-data-solr-headless:8983/solr",
"node_name":"myapp-data-solr-1.myapp-data-solr-headless:8983_solr",
"state":"down", "type":"TLOG", "force_set_state":"false",
"property.preferredleader":"true"}, "core_node6":{
"core":"mydata_0_8000-9fff_replica_t4",
"base_url":"http://myapp-data-solr-2.myapp-data-solr-headless:8983/solr",
"node_name":"myapp-data-solr-2.myapp-data-solr-headless:8983_solr",
"state":"down", "type":"TLOG", "force_set_state":"false"}}},...


myapp-data-solr-1
2021-02-12 22:45:56.600 ERROR (indexFetcher-4092-thread-1) [ ]
o.a.s.h.ReplicationHandler Index fetch failed
:org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: mydata slice: 0_8000-9fff saw
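
As a rough cross-check of the "hours" figure, the two UpdateLog entries in the log output above (replay start at 19:36:05.966, status at 22:13:03.084 with "current pos" and "current size") are enough for a back-of-the-envelope throughput and time-left estimate. The sketch below is plain Java arithmetic, not a Solr API; the class and method names are hypothetical, and it assumes the replay continues at the same average rate and that the tlog does not grow further, neither of which the logs guarantee.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Back-of-the-envelope estimate from two log entries; hypothetical helper, not a Solr API.
final class TlogReplayEstimate {
    // Timestamp layout used by the log lines, e.g. "2021-02-12 19:36:05.966".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSS");

    // Average replay throughput in bytes per second between two log timestamps.
    static double bytesPerSecond(String start, String now, long currentPos) {
        Duration elapsed = Duration.between(
                LocalDateTime.parse(start, LOG_TIME), LocalDateTime.parse(now, LOG_TIME));
        return currentPos / (elapsed.toMillis() / 1000.0);
    }

    // Seconds left if the remaining bytes replay at the same average rate.
    static double secondsRemaining(long currentPos, long currentSize, double bytesPerSecond) {
        return (currentSize - currentPos) / bytesPerSecond;
    }

    public static void main(String[] args) {
        long pos = 27_101_174_167L;   // "current pos" from the 22:13:03.084 entry
        long size = 33_357_447_222L;  // "current size" from the same entry
        double rate = bytesPerSecond("2021-02-12 19:36:05.966", "2021-02-12 22:13:03.084", pos);
        System.out.printf("fraction read: %.3f%n", (double) pos / size);
        System.out.printf("rate: %.2f MB/s%n", rate / 1_000_000);
        System.out.printf("estimated time left: %.0f min%n",
                secondsRemaining(pos, size, rate) / 60);
    }
}
```

With the numbers from the log this works out to slightly under 3 MB/s on average, with somewhat more than half an hour of replay still outstanding after more than two and a half hours, which matches the observation that the shard stays leaderless for hours.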

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-14 Thread David Smiley

Hello Ronen,

Can you please file a JIRA issue?  Some quick searches did not turn
anything up.  It would be super helpful to me if you could list a series of
steps with Solr out-of-the-box in 8.8 including what data to index and
query.  Solr already includes the "tech products" sample data; maybe that
can illustrate the problem?  It's not clear if nested schema or nested docs
are actually required in your example.  If you share the JIRA issue with
me, I'll chase this one down.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sun, Feb 14, 2021 at 11:16 AM Ronen Nussbaum wrote:

> Hi All,
>
> I discovered a strange behaviour with this combination.
> Not only does the atomic update fail, but the child documents are also not
> properly indexed, and you can't use highlights on their text fields.
> Currently there is no workaround other than reindexing.
>
> Checked on 8.3.0, 8.6.1 and 8.8.0.
> 1. Configure nested schema.
> 2. enableLazyFieldLoading is true (default).
> 3. Run a search with hl.method=unified and hl.fl=
> 4. Do an atomic update on some of the *parents* of the documents returned
> from #3.
>
> You get an error: "TransactionLog doesn't know how to serialize class
> org.apache.lucene.document.LazyDocument$LazyField".
>
> Now trying to run #3 again yields an error message that the text field is
> indexed without positions.
>
> If enableLazyFieldLoading is false or if using the default highlighter this
> doesn't happen.
>
> Ronen.
>


Re: Asymmetric Key Size not sufficient

2021-02-14 Thread Mahir Kabir

Hi,

Thanks for letting me know.

Best,
Mahir

On Sun, Feb 14, 2021, 9:08 AM Mike Drob wrote:

> Future vulnerability reports should be sent to secur...@apache.org so
> that they can be resolved privately.
>
> Thank you
>
> On Fri, Feb 12, 2021 at 10:17 AM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> Recent versions of Solr use 2048.
>>
>> https://github.com/apache/lucene-solr/blob/branch_8_6/solr/core/src/java/org/apache/solr/util/CryptoKeys.java#L332
>>
>> Thanks for your report.
>>
>> On Fri, Feb 12, 2021 at 3:44 PM Mahir Kabir wrote:
>>
>> > Hello,
>> >
>> > I am a Ph.D. student at Virginia Tech, USA. While working on a
>> > security-related project, we came across the following vulnerability in
>> > the source code -
>> >
>> > In file
>> > https://github.com/apache/lucene-solr/blob/branch_6_6/solr/core/src/java/org/apache/solr/util/CryptoKeys.java
>> > <https://github.com/apache/ranger/blob/71e1dd40366c8eb8e9c498b0b5158d85d603af02/kms/src/main/java/org/apache/hadoop/crypto/key/RangerKeyStore.java>
>> > (at Line 300) Key Size was set as 1024.
>> >
>> > *Security Impact*:
>> >
>> > An RSA key size below 2048 bits makes the system vulnerable to
>> > brute-force attacks.
>> >
>> > *Useful resource*:
>> > https://rules.sonarsource.com/java/type/Vulnerability/RSPEC-4426
>> >
>> > *Solution we suggest*:
>> >
>> > For the RSA algorithm, the key size should be >= 2048 bits.
>> >
>> > *Please share with us your opinions/comments if there are any*:
>> >
>> > Is the bug report helpful?
>> >
>> > Please let us know what you think about the issue. Any feedback will be
>> > appreciated.
>> >
>> > Thank you,
>> > Md Mahir Asef Kabir
>> > Ph.D. Student
>> > Department of CS
>> > Virginia Tech
>> >
>>
>


Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-14 Thread Ronen Nussbaum

Hi All,

I discovered a strange behaviour with this combination.
Not only does the atomic update fail, but the child documents are also not
properly indexed, and you can't use highlights on their text fields.
Currently there is no workaround other than reindexing.

Checked on 8.3.0, 8.6.1 and 8.8.0.
1. Configure nested schema.
2. enableLazyFieldLoading is true (default).
3. Run a search with hl.method=unified and hl.fl=
4. Do an atomic update on some of the *parents* of the documents returned
from #3.

You get an error: "TransactionLog doesn't know how to serialize class
org.apache.lucene.document.LazyDocument$LazyField".

Now trying to run #3 again yields an error message that the text field is
indexed without positions.

If enableLazyFieldLoading is false or if using the default highlighter this
doesn't happen.

Ronen.


SolrJ: SolrInputDocument.addField()

2021-02-14 Thread Steven White

Hi everyone,

It looks like I'm misusing the SolrJ API SolrInputDocument.addField(), so I
need clarification.

Here is an example of what I have in my code:

SolrInputDocument doc = new SolrInputDocument();
doc.addField("MyFieldOne", "some data");
doc.addField("MyFieldTwo", 100);

The above code is creating 2 fields for me (if they don't exist already)
and then indexing the data to those fields.  The data is "some data" and
the number 100.  However, when the field is created, it is not using the
custom field type that I created in my schema.  My question is, how do I
tell addField() to use my custom field type?

I _think_ I have to first call SolrInputDocument.createField() and then call
SolrInputDocument.addField()?  Or is the process of indexing data into a
field done via some other API I overlooked?

I need some guidance to make sure I get the logic right.

Thanks.

Steven


Re: Asymmetric Key Size not sufficient

2021-02-14 Thread Mike Drob

Future vulnerability reports should be sent to secur...@apache.org so that
they can be resolved privately.

Thank you

On Fri, Feb 12, 2021 at 10:17 AM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Recent versions of Solr use 2048.
>
> https://github.com/apache/lucene-solr/blob/branch_8_6/solr/core/src/java/org/apache/solr/util/CryptoKeys.java#L332
>
> Thanks for your report.
>
> On Fri, Feb 12, 2021 at 3:44 PM Mahir Kabir wrote:
>
> > Hello,
> >
> > I am a Ph.D. student at Virginia Tech, USA. While working on a
> > security-related project, we came across the following vulnerability in
> > the source code -
> >
> > In file
> > https://github.com/apache/lucene-solr/blob/branch_6_6/solr/core/src/java/org/apache/solr/util/CryptoKeys.java
> > <https://github.com/apache/ranger/blob/71e1dd40366c8eb8e9c498b0b5158d85d603af02/kms/src/main/java/org/apache/hadoop/crypto/key/RangerKeyStore.java>
> > (at Line 300) Key Size was set as 1024.
> >
> > *Security Impact*:
> >
> > An RSA key size below 2048 bits makes the system vulnerable to
> > brute-force attacks.
> >
> > *Useful resource*:
> > https://rules.sonarsource.com/java/type/Vulnerability/RSPEC-4426
> >
> > *Solution we suggest*:
> >
> > For the RSA algorithm, the key size should be >= 2048 bits.
> >
> > *Please share with us your opinions/comments if there are any*:
> >
> > Is the bug report helpful?
> >
> > Please let us know what you think about the issue. Any feedback will be
> > appreciated.
> >
> > Thank you,
> > Md Mahir Asef Kabir
> > Ph.D. Student
> > Department of CS
> > Virginia Tech
> >
>
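
For reference, the key-size recommendation quoted above (RSA keys of at least 2048 bits) can be sketched with the standard JDK KeyPairGenerator API. This is a minimal illustration only and is not the code from Solr's CryptoKeys class; the class name RsaKeySizeExample, its constant and its helper methods are hypothetical.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;
import java.security.interfaces.RSAPublicKey;

// Illustration with the standard JDK API; not the code from Solr's CryptoKeys class.
final class RsaKeySizeExample {
    // The size the thread says recent Solr versions use.
    static final int KEY_SIZE_BITS = 2048;

    // Generates an RSA key pair of the requested size.
    static KeyPair generate(int keySizeBits) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(keySizeBits);
            return generator.generateKeyPair();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is required to provide RSA.
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    // Bit length of the public modulus, i.e. the effective key size.
    static int modulusBits(KeyPair pair) {
        return ((RSAPublicKey) pair.getPublic()).getModulus().bitLength();
    }

    public static void main(String[] args) {
        KeyPair pair = generate(KEY_SIZE_BITS);
        System.out.println("RSA modulus bits: " + modulusBits(pair));
    }
}
```

Checking the modulus length of an existing public key in the same way is a quick means of confirming which key size a deployment actually uses.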