[ 
https://issues.apache.org/jira/browse/SOLR-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated SOLR-7393:
------------------------------
    Description: 
When switching SolrCloud from local dataDir to the HDFS directory factory, 
indexing performance falls through the floor.

I've also observed very high latency on both QTime and the code timer for HDFS 
writes compared to local dataDir writes (using check_solr_write.pl from 
https://github.com/harisekhon/nagios-plugins). Single test document write 
latency jumps from a few dozen milliseconds to 700-1700 milliseconds, over 2000 
on some runs.
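
For anyone wanting to reproduce the two timings without the Perl plugin, here 
is a minimal Python sketch. It is not how check_solr_write.pl works internally; 
it only assumes that Solr's JSON update handler returns a responseHeader 
containing QTime. The update URL passed to solr_sender is a placeholder you 
supply, and plain urllib will not authenticate against a Kerberos-secured 
endpoint, so treat it as an illustration for an unsecured test node.

```python
import json
import time
import urllib.request


def timed_write(send, doc):
    """Send one document and return (client_ms, qtime_ms).

    `send` is any callable that takes a dict and returns Solr's parsed
    JSON response. client_ms is wall-clock time seen by the caller;
    qtime_ms is the QTime Solr reports (None if the field is absent).
    """
    start = time.perf_counter()
    response = send(doc)
    client_ms = (time.perf_counter() - start) * 1000.0
    qtime_ms = response.get("responseHeader", {}).get("QTime")
    return client_ms, qtime_ms


def solr_sender(update_url):
    """Build a sender that POSTs one JSON document to a Solr update URL.

    update_url is an assumption supplied by the caller, for example the
    /update handler of a test collection with wt=json and commit=true.
    """
    def send(doc):
        request = urllib.request.Request(
            update_url,
            data=json.dumps([doc]).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=30) as reply:
            return json.load(reply)
    return send
```

Running timed_write against the same collection before and after the switch to 
HDFS gives a like-for-like comparison of client-side latency and QTime.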

A previous bulk online indexing job from Hive to SolrCloud that took 2 hours 
for 620M rows ended up taking a projected 20+ hours and never completing, 
usually breaking around the 16-17 hour timeframe when left overnight.
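
To put those figures in throughput terms, the following back-of-the-envelope 
Python sketch uses only the numbers quoted above (620M rows, 2 hours locally, a 
projected 20+ hours on HDFS). Because 20 hours is a lower bound on the HDFS 
run, the HDFS rate is an upper bound and the slowdown is at least the value 
computed here.

```python
ROWS = 620_000_000  # rows in the bulk indexing job


def docs_per_second(rows, hours):
    """Average indexing rate implied by a job of `rows` taking `hours`."""
    return rows / (hours * 3600.0)


local_rate = docs_per_second(ROWS, 2)    # local dataDir run
hdfs_rate = docs_per_second(ROWS, 20)    # projected HDFS run (lower bound on time)
slowdown = local_rate / hdfs_rate

print(f"local dataDir: {local_rate:,.0f} docs/s")
print(f"HDFS (at best): {hdfs_rate:,.0f} docs/s")
print(f"slowdown: at least {slowdown:.0f}x")
```

That works out to roughly 86,000 docs/s on local disk versus at most about 
8,600 docs/s on HDFS, i.e. a slowdown of at least 10x.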

It's worth noting that, on the advice of Mark Miller, I had to disable the HDFS 
write cache, which was causing index corruption (SOLR-7255); he tells me this 
doesn't make much performance difference anyway.

This is probably also related to SolrCloud not respecting the HDFS replication 
factor, effectively making 4 copies of data instead of 2 (SOLR-6528), but that 
alone doesn't account for the massive performance drop going from vanilla 
SolrCloud to SolrCloud on HDFS HA + Kerberos.

Hari Sekhon
http://www.linkedin.com/in/harisekhon

  was:
When switching SolrCloud from local dataDir to the HDFS directory factory, 
indexing performance falls through the floor.

I've also observed very high latency on both QTime and the code timer for HDFS 
writes compared to local dataDir writes (using check_solr_write.pl from 
https://github.com/harisekhon/nagios-plugins). Single test document write 
latency jumps from a few dozen milliseconds to 700-1700 milliseconds, over 2000 
on some runs.

A previous bulk indexing Hive to SolrCloud online indexing job that took 2 
hours for 620M rows ended up taking a projected 20+ hours and never completing, 
usually breaking around the 16-17 hour timeframe when left overnight.

It's worth noting that, on the advice of Mark Miller, I had to disable the HDFS 
write cache, which was causing index corruption (SOLR-7255); he tells me this 
doesn't make much performance difference anyway.

This is probably also related to SolrCloud not respecting the HDFS replication 
factor, effectively making 4 copies of data instead of 2 (SOLR-6528), but that 
alone doesn't account for the massive performance drop going from vanilla 
SolrCloud to SolrCloud on HDFS HA + Kerberos.

Hari Sekhon
http://www.linkedin.com/in/harisekhon


> HDFS poor indexing performance
> ------------------------------
>
>                 Key: SOLR-7393
>                 URL: https://issues.apache.org/jira/browse/SOLR-7393
>             Project: Solr
>          Issue Type: Bug
>          Components: Hadoop Integration, hdfs, SolrCloud
>    Affects Versions: 4.7.2, 4.10.3
>         Environment: HDP 2.2 / HDP Search + LucidWorks Hive SerDe
>            Reporter: Hari Sekhon
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
