[ 
https://issues.apache.org/jira/browse/SOLR-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685642#comment-13685642
 ] 

Mark Miller commented on SOLR-4916:
-----------------------------------


The Patch:

* *An HdfsDirectory implementation that uses a BlockDirectory to cache 
(read/write) hdfs blocks.*

The default index codec currently supports append-only filesystems, so the 
implementation is fairly straightforward and effective. It would be 
interesting if we could easily tell whether a codec is append-only.
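
To illustrate the caching idea only, here is a minimal sketch of an LRU block 
cache sitting in front of a slow, block-addressed source. It is not the 
patch's BlockDirectory: BlockCacheSketch and BlockSource are hypothetical 
names, the source is an in-memory stand-in rather than an hdfs client, and the 
sketch covers reads only and is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a fixed-size LRU cache of blocks read from a slow source. */
final class BlockCacheSketch {

    /** Hypothetical stand-in for a remote file; this is not an hdfs API. */
    interface BlockSource {
        byte[] readBlock(long blockIndex);
    }

    private final BlockSource source;
    private final int blockSize;
    private final Map<Long, byte[]> cache;
    private int misses = 0;

    BlockCacheSketch(BlockSource source, int blockSize, final int maxBlocks) {
        this.source = source;
        this.blockSize = blockSize;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxBlocks; // evict the least recently used block
            }
        };
    }

    /** Reads one byte at an absolute position, loading its block on a cache miss. */
    byte readByte(long position) {
        long blockIndex = position / blockSize;
        byte[] block = cache.get(blockIndex);
        if (block == null) {
            block = source.readBlock(blockIndex);
            misses++;
            cache.put(blockIndex, block);
        }
        return block[(int) (position % blockSize)];
    }

    /** Number of reads that had to go to the underlying source. */
    int misses() {
        return misses;
    }
}
```

Repeated reads within a cached block never touch the source, which is the 
point of caching blocks when every remote read is expensive.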

* *An HdfsDirectoryFactory to hook this into Solr.*

Now that Directory is a first-class citizen in Solr, this allows pretty much 
everything to work on hdfs with few other tweaks, including Replication. 
It also adds a new option to DirectoryFactory to have Searchers explicitly 
reserve commit points, because hdfs has neither Unix's delete-on-last-close 
behavior nor Windows' behavior of failing a delete while a file is in use.
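
One way to picture explicit reservation is a reference count per commit 
generation, with deletion allowed only once the count drops to zero. The 
sketch below is an assumption about the general shape, not the 
DirectoryFactory option from the patch; CommitReservations and its methods 
are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: reference-counted reservations of commit points. */
final class CommitReservations {

    // Maps a commit generation to the number of searchers currently reserving it.
    private final Map<Long, Integer> refCounts = new HashMap<>();

    /** Called by a searcher before it starts reading a commit point. */
    synchronized void reserve(long generation) {
        refCounts.merge(generation, 1, Integer::sum);
    }

    /** Called by a searcher when it is done with a commit point. */
    synchronized void release(long generation) {
        // Returning null removes the entry once the last reservation is released.
        refCounts.computeIfPresent(generation, (g, n) -> n > 1 ? n - 1 : null);
    }

    /** A deletion policy would check this before removing a commit's files. */
    synchronized boolean canDelete(long generation) {
        return !refCounts.containsKey(generation);
    }
}
```

With this in place, the filesystem no longer has to protect files that are in 
use; the bookkeeping does it instead.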

* *An HdfsUpdateLog that allows writing the transaction log to hdfs as well.* 

I talked to Yonik a while back and I think we are in agreement that we don't 
currently want to support a pluggable UpdateLog, so this one is built in and 
is triggered by an hdfs:// prefixed update log path.
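
As a rough illustration of triggering on the prefix, a selection step could 
look like the sketch below. UpdateLogSelector and LogKind are hypothetical 
names, and the real patch may perform this check elsewhere or differently.

```java
/** Hypothetical sketch: pick an update log flavor from the configured path. */
final class UpdateLogSelector {

    enum LogKind { LOCAL, HDFS }

    private static final String HDFS_PREFIX = "hdfs://";

    /** Returns HDFS only for paths that start with the hdfs:// prefix. */
    static LogKind kindFor(String updateLogPath) {
        if (updateLogPath != null && updateLogPath.startsWith(HDFS_PREFIX)) {
            return LogKind.HDFS;
        }
        return LogKind.LOCAL; // anything else, including null, stays on the local filesystem
    }
}
```

For example, a path such as hdfs://namenode:8020/solr/tlog (a made-up 
address) would select the hdfs log, while /var/solr/data/tlog would not.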

* *An HdfsLockFactory.*

A simple implementation that writes lock files to hdfs rather than the local filesystem.
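
The general lock-file technique can be sketched with an atomic "create only 
if absent" call. The example below uses java.nio on the local filesystem 
purely as a stand-in; it does not use any hdfs API, and LockFileSketch is a 
hypothetical name rather than the class in the patch.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: a lock represented by the existence of a file. */
final class LockFileSketch {

    private final Path lockFile;
    private boolean held = false;

    LockFileSketch(Path lockDir, String lockName) {
        this.lockFile = lockDir.resolve(lockName);
    }

    /** Tries to take the lock; returns false if the lock file already exists. */
    boolean obtain() throws IOException {
        try {
            // createFile checks for existence and creates the file atomically.
            Files.createFile(lockFile);
            held = true;
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    /** Removes the lock file, but only if this instance created it. */
    void release() throws IOException {
        if (held) {
            Files.deleteIfExists(lockFile);
            held = false;
        }
    }
}
```

A known weakness of this approach is that a crashed holder leaves its lock 
file behind, so something has to clean up stale locks.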

* *SOLR-4655 Overseer should assign node names*

Includes the work for SOLR-4655. While a good general improvement, this is 
also important for this patch because we use the node name in hdfs paths: if a 
different machine takes over for that path, it's awkward to have the address 
of another machine as part of it.

* *Tests*

There are a few new tests specifically written for HDFS. There are also a 
bunch of new tests that simply run the current pertinent SolrCloud tests 
against hdfs. Because the SolrCloud tests are already so long, this can 
greatly increase the test run time on a slower machine. There is almost no 
noticeable slowdown on my 6 core machine, but it's pretty awful on my 2 core 
machine. To deal with this, my patch makes the tests that are functionally 
equivalent to current tests, but run against hdfs, run only nightly.

                
> Add support to write and read Solr index files and transaction log files to 
> and from HDFS.
> ------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4916
>                 URL: https://issues.apache.org/jira/browse/SOLR-4916
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>         Attachments: SOLR-4916.patch
>
>

