[jira] [Commented] (SOLR-7954) ArrayIndexOutOfBoundsException from distributed HLL serialization logic when using using stats.field={!cardinality=1.0} in a distributed query

2016-01-07 Thread Modassar Ather (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087105#comment-15087105
 ] 

Modassar Ather commented on SOLR-7954:
--

{noformat}q=fl1:net*=fl=50=true={!cardinality=1.0}fl{noformat}
  
The above query returns a cardinality of around 15 million and takes around 4 
minutes. Similar response times are seen with other queries that yield high 
cardinality. Kindly note that cardinality=1.0 is the desired goal.
In the above example, fl1 is a text field, whereas fl is a docValues-enabled, 
non-stored, non-indexed field.
Kindly let me know whether such a response time is expected or whether I am 
missing something about this feature in my query.
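
For readers who want to reproduce a request of this kind, the following is a minimal sketch of how the stats parameters could be assembled and URL-encoded using only the JDK (Java 10 or later). The class and method names are hypothetical and not part of Solr or SolrJ. The field names fl1 and fl come from the comment above, and only the q, stats and stats.field parameters are shown; any other parameters in the original query are not reproduced here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
public class CardinalityQuery {

    /** Builds a URL-encoded query string that asks the stats component for a cardinality estimate. */
    static String build(String q, String statsField, double accuracy) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("stats", "true");
        // The cardinality local param is prefixed to the field name.
        params.put("stats.field", "{!cardinality=" + accuracy + "}" + statsField);

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey())
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("fl1:net*", "fl", 1.0));
    }
}
```

The resulting string can be appended to a collection's select handler URL. Encoding the braces and the exclamation mark avoids problems with HTTP clients that reject those characters in a raw URL.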

> ArrayIndexOutOfBoundsException from distributed HLL serialization logic when 
> using using stats.field={!cardinality=1.0} in a distributed query
> --
>
> Key: SOLR-7954
> URL: https://issues.apache.org/jira/browse/SOLR-7954
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.2.1
> Environment: SolrCloud 4 node cluster.
> Ubuntu 12.04
> OS Type 64 bit
>Reporter: Modassar Ather
>Assignee: Hoss Man
> Fix For: 5.4, Trunk
>
> Attachments: SOLR-7954.patch, SOLR-7954.patch, SOLR-7954.patch
>
>
> User reports indicate that using {{stats.field=\{!cardinality=1.0\}foo}} on a 
> field that has extremely high cardinality on a single shard (example: 150K 
> unique values) can lead to "ArrayIndexOutOfBoundsException: 3" on the shard 
> during serialization of the HLL values.
> Using "cardinality=0.9" (or lower) doesn't produce the same symptoms, 
> suggesting the problem is specific to large log2m and regwidth values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7954) ArrayIndexOutOfBoundsException from distributed HLL serialization logic when using using stats.field={!cardinality=1.0} in a distributed query

2015-08-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715188#comment-14715188
 ] 

ASF subversion and git services commented on SOLR-7954:
---

Commit 1697977 from hoss...@apache.org in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1697977 ]

SOLR-7954: Fixed an integer overflow bug in the HyperLogLog code used by the 
'cardinality' option of stats.field to prevent ArrayIndexOutOfBoundsException 
in a distributed search when a large precision is selected and a large number 
of values exist in each shard (merge r1697969)
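
The commit message names the class of bug but not the exact expression involved, so the following is only an illustrative sketch of how an integer overflow can arise in this kind of code. It is not the actual patch. The class, method and variable names are hypothetical, and the values are chosen to make the wraparound visible rather than taken from Solr's HLL settings. The idea is that multiplying a register count by a register width in 32-bit arithmetic can wrap, and a size or index derived from the wrapped value can then fall outside the array it is meant to address.

```java
// Hypothetical illustration of 32-bit overflow; not the SOLR-7954 patch.
public class HllOverflowSketch {

    /** Overflow-prone variant: the product is computed in int arithmetic and can wrap. */
    static int totalBitsInt(int registerCount, int bitsPerRegister) {
        return registerCount * bitsPerRegister;
    }

    /** Safer variant: widen one operand to long before multiplying. */
    static long totalBitsLong(int registerCount, int bitsPerRegister) {
        return (long) registerCount * bitsPerRegister;
    }

    public static void main(String[] args) {
        int registerCount = 1 << 30; // hypothetical: 2^30 registers
        int bitsPerRegister = 8;     // hypothetical register width in bits

        // 2^30 * 8 = 2^33, which does not fit in a 32-bit int and wraps to 0.
        System.out.println(totalBitsInt(registerCount, bitsPerRegister));
        // The same product computed as a long is 8589934592.
        System.out.println(totalBitsLong(registerCount, bitsPerRegister));
    }
}
```

Where wrapping must never pass silently, `Math.multiplyExact` is an alternative: it throws an `ArithmeticException` instead of returning a wrapped result.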




[jira] [Commented] (SOLR-7954) ArrayIndexOutOfBoundsException from distributed HLL serialization logic when using using stats.field={!cardinality=1.0} in a distributed query

2015-08-26 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14714473#comment-14714473
 ] 

ASF subversion and git services commented on SOLR-7954:
---

Commit 1697969 from hoss...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1697969 ]

SOLR-7954: Fixed an integer overflow bug in the HyperLogLog code used by the 
'cardinality' option of stats.field to prevent ArrayIndexOutOfBoundsException 
in a distributed search when a large precision is selected and a large number 
of values exist in each shard




[jira] [Commented] (SOLR-7954) ArrayIndexOutOfBoundsException from distributed HLL serialization logic when using using stats.field={!cardinality=1.0} in a distributed query

2015-08-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711783#comment-14711783
 ] 

Hoss Man commented on SOLR-7954:


bq. Later I indexed 40 documents on which I could not reproduce it. All the 
shards had around 10 documents each.
bq. There are 4 shards with no replica on my test environment.

Modassar: as I tried to explain in my earlier comments, the number of shards / 
documents doesn't really affect the issue -- the root problem has to do with 
the number of unique _values_ in a single shard which are added to the 
underlying HyperLogLog data structure and then serialized.  Doing more testing 
where you tweak the routing or doc counts may find _different_ bugs, but for 
this specific bug the core task is reviewing the HLL serialization code 
related to the various precision options (which are set based on the 
cardinality local param) and the number of unique (hashed) values in each HLL.




[jira] [Commented] (SOLR-7954) ArrayIndexOutOfBoundsException from distributed HLL serialization logic when using using stats.field={!cardinality=1.0} in a distributed query

2015-08-24 Thread Modassar Ather (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710599#comment-14710599
 ] 

Modassar Ather commented on SOLR-7954:
--

To add to the summary and description.

I changed the {noformat}doc.addField(colid, val!+i+!-+ref+i);{noformat} 
to {noformat}doc.addField(colid, val+i+!-+ref+i);{noformat}
With this change the documents were distributed across all the nodes. I indexed 
1 million documents and was able to reproduce the issue. All the shards had 
around 20 documents each.
Later I indexed 40 documents, on which I could not reproduce it. All the 
shards had around 10 documents each.
There are 4 shards with no replicas in my test environment.
