[ 
https://issues.apache.org/jira/browse/SOLR-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438178#comment-16438178
 ] 

Boris Pasko commented on SOLR-6305:
-----------------------------------

Unfortunately, it seems that my patch does not cover all use cases. Merging 
may still be done with the server-side replication factor:
{noformat}
$ hadoop fs -du -h /solr/classified/core_node4/data
70.0 G   209.9 G  /solr/classified/core_node4/data/index.20180411213103577
78       78       /solr/classified/core_node4/data/index.properties
210      210      /solr/classified/core_node4/data/replication.properties
0        0        /solr/classified/core_node4/data/snapshot_metadata
915.3 M  1.8 G    /solr/classified/core_node4/data/tlog
{noformat}
and
{noformat}
$ hadoop fs -ls /solr/classified/core_node2/data/index
-rwxr-xr-x   1 solr solr        418 2018-04-13 21:25 /solr/classified/core_node2/data/index/_13ke.si
-rwxr-xr-x   3 solr solr  663715968 2018-04-11 17:22 /solr/classified/core_node2/data/index/_3fp.fdt
-rwxr-xr-x   3 solr solr     517308 2018-04-11 17:21 /solr/classified/core_node2/data/index/_3fp.fdx
-rwxr-xr-x   3 solr solr       3638 2018-04-11 17:21 /solr/classified/core_node2/data/index/_3fp.fnm
-rwxr-xr-x   3 solr solr   25644767 2018-04-11 17:21 /solr/classified/core_node2/data/index/_3fp.nvd
-rwxr-xr-x   3 solr solr        178 2018-04-11 17:21 /solr/classified/core_node2/data/index/_3fp.nvm
-rwxr-xr-x   3 solr solr        522 2018-04-11 17:22 /solr/classified/core_node2/data/index/_3fp.si
-rwxr-xr-x   1 solr solr     356244 2018-04-12 08:14 /solr/classified/core_node2/data/index/_3fp_9v.liv
-rwxr-xr-x   3 solr solr 1634072760 2018-04-11 17:21 /solr/classified/core_node2/data/index/_3fp_Lucene50_0.doc
-rwxr-xr-x   3 solr solr 2698137408 2018-04-11 17:22 /solr/classified/core_node2/data/index/_3fp_Lucene50_0.pos
-rwxr-xr-x   3 solr solr  365912676 2018-04-11 17:21 /solr/classified/core_node2/data/index/_3fp_Lucene50_0.tim
-rwxr-xr-x   3 solr solr    6024240 2018-04-11 17:21 /solr/classified/core_node2/data/index/_3fp_Lucene50_0.tip
-rwxr-xr-x   3 solr solr  596163565 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi.fdt
-rwxr-xr-x   3 solr solr     479765 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi.fdx
-rwxr-xr-x   3 solr solr       3638 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi.fnm
-rwxr-xr-x   3 solr solr   26688139 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi.nvd
-rwxr-xr-x   3 solr solr        178 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi.nvm
-rwxr-xr-x   3 solr solr        522 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi.si
-rwxr-xr-x   3 solr solr 1466093502 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi_Lucene50_0.doc
-rwxr-xr-x   3 solr solr 2374256964 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi_Lucene50_0.pos
-rwxr-xr-x   3 solr solr  345128291 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi_Lucene50_0.tim
-rwxr-xr-x   3 solr solr    5839414 2018-04-11 17:22 /solr/classified/core_node2/data/index/_5oi_Lucene50_0.tip
-rwxr-xr-x   1 solr solr     333668 2018-04-12 04:11 /solr/classified/core_node2/data/index/_5oi_aj.liv
{noformat}
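
For anyone reading the two listings above: in the {{hadoop fs -du}} output the 
second column (space consumed across all replicas) divided by the first (file 
size) approximates the effective replication factor, and in the 
{{hadoop fs -ls}} output the second field is the per-file replication factor. 
The following is only a rough sketch of that arithmetic, not part of the patch. 
The class name {{ReplicationCheck}} and its method names are made up for 
illustration, and it assumes the {{-h}} units are binary (1 G = 1024 M).

```java
public class ReplicationCheck {

    /** Converts a human-readable size such as "70.0 G" or "78" to bytes.
     *  Assumes binary units (K = 1024 bytes), which is an assumption here. */
    static double parseSize(String text) {
        String[] parts = text.trim().split("\\s+");
        double value = Double.parseDouble(parts[0]);
        if (parts.length < 2) {
            return value; // plain byte count, no unit suffix
        }
        switch (Character.toUpperCase(parts[1].charAt(0))) {
            case 'K': return value * 1024.0;
            case 'M': return value * 1024.0 * 1024.0;
            case 'G': return value * 1024.0 * 1024.0 * 1024.0;
            case 'T': return value * 1024.0 * 1024.0 * 1024.0 * 1024.0;
            default: throw new IllegalArgumentException("Unknown unit: " + parts[1]);
        }
    }

    /** Ratio of disk space consumed to logical size, taken from the
     *  two columns of one "hadoop fs -du -h" row. */
    static double effectiveReplication(String size, String diskUsage) {
        return parseSize(diskUsage) / parseSize(size);
    }

    /** Returns the replication factor (second field) of one
     *  "hadoop fs -ls" row, or -1 for directories, which show "-". */
    static int replicationOf(String lsLine) {
        String[] fields = lsLine.trim().split("\\s+");
        if (fields.length < 2 || fields[1].equals("-")) {
            return -1;
        }
        return Integer.parseInt(fields[1]);
    }

    public static void main(String[] args) {
        // Rows copied from the listings in this comment.
        System.out.println("index ratio: " + effectiveReplication("70.0 G", "209.9 G"));
        System.out.println("tlog ratio:  " + effectiveReplication("915.3 M", "1.8 G"));
        System.out.println("_3fp.fdt:    " + replicationOf(
            "-rwxr-xr-x   3 solr solr  663715968 2018-04-11 17:22 "
            + "/solr/classified/core_node2/data/index/_3fp.fdt"));
    }
}
```

With the numbers above, the index directory comes out at a ratio of roughly 3, 
which is what you would expect if the large segment files were written with the 
default replication factor rather than the configured one.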


> Ability to set the replication factor for index files created by 
> HDFSDirectoryFactory
> -------------------------------------------------------------------------------------
>
>                 Key: SOLR-6305
>                 URL: https://issues.apache.org/jira/browse/SOLR-6305
>             Project: Solr
>          Issue Type: Improvement
>          Components: hdfs
>         Environment: hadoop-2.2.0
>            Reporter: Timothy Potter
>            Priority: Major
>         Attachments: 
> 0001-OIQ-23224-SOLR-6305-Fixed-SOLR-6305-by-reading-the-r.patch
>
>
> HdfsFileWriter doesn't allow us to create files in HDFS with a different 
> replication factor than the configured DFS default because it uses:     
> {{FsServerDefaults fsDefaults = fileSystem.getServerDefaults(path);}}
> Since we have two forms of replication going on when using 
> HDFSDirectoryFactory, it would be nice to be able to set the HDFS replication 
> factor for the Solr directories to a lower value than the default. I realize 
> this might reduce the chance of data locality but since Solr cores each have 
> their own path in HDFS, we should give operators the option to reduce it.
> My original thinking was to just use Hadoop setrep to customize the 
> replication factor, but that's a one-time shot and doesn't affect new files 
> created. For instance, I did:
> {{hadoop fs -setrep -R 1 solr49/coll1}}
> (My default DFS replication is set to 3; I'm setting it to 1 here just as an 
> example.)
> Then added some more docs to the coll1 and did:
> {{hadoop fs -stat %r solr49/hdfs1/core_node1/data/index/segments_3}}
> 3 <-- should be 1
> So it looks like new files don't inherit the repfact from their parent 
> directory.
> Not sure if we need to go as far as allowing different replication factor per 
> collection but that should be considered if possible.
> I looked at the Hadoop 2.2.0 code to see if there was a way to work through 
> this using the Configuration object but nothing jumped out at me ... and the 
> implementation for getServerDefaults(path) is just:
>   public FsServerDefaults getServerDefaults(Path p) throws IOException {
>     return getServerDefaults();
>   }
> Path is ignored ;-)


