[
https://issues.apache.org/jira/browse/SOLR-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685642#comment-13685642
]
Mark Miller commented on SOLR-4916:
-----------------------------------
The Patch:
* *An HdfsDirectory implementation that uses a BlockDirectory to cache
(read/write) hdfs blocks.*
The default index codec currently supports append-only filesystems, so the
implementation is fairly straightforward and effective. It would be interesting
if we could easily tell whether a codec is append-only.
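To make "append only" concrete: on such a filesystem a writer can add bytes at the end of a file but cannot go back and overwrite earlier bytes. The following is a minimal sketch of that constraint, not code from the patch. The class AppendOnlyBuffer and its methods are hypothetical and are not part of Lucene, Solr, or HDFS.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration only; not a Lucene, Solr, or HDFS class. */
final class AppendOnlyBuffer {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    /** Current length, which is also the only legal write position. */
    long length() {
        return out.size();
    }

    /** Writes at the given position; anything but the current end is rejected. */
    void writeAt(long position, byte[] data) {
        if (position != length()) {
            throw new UnsupportedOperationException(
                "append-only: cannot write at " + position + ", end is " + length());
        }
        out.write(data, 0, data.length);
    }

    public static void main(String[] args) {
        AppendOnlyBuffer buf = new AppendOnlyBuffer();
        buf.writeAt(0, new byte[] {1, 2, 3}); // appending at the end is fine
        buf.writeAt(3, new byte[] {4});       // position 3 is the new end
        try {
            buf.writeAt(0, new byte[] {9});   // going back to patch a header is not
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A codec that only ever writes at the current end of each file would pass through a wrapper like this untouched, which is one conceivable way to test whether a codec is append-only.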
* *An HdfsDirectoryFactory to hook this into Solr.*
Now that Directory is a first-class citizen in Solr, this allows pretty much
everything to work on hdfs with few other tweaks, including Replication.
It also adds a new option to DirectoryFactory to have Searchers explicitly
reserve commit points: hdfs has neither the delete-on-last-close behavior of
unix nor the delete-fails-while-in-use behavior of windows.
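One way to picture explicit reservation is a reference count per commit generation that a deletion policy consults before removing files. This is only a sketch under that assumption. The class CommitReservations and its method names are hypothetical and do not reflect the patch's actual DirectoryFactory option.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; names are illustrative, not the patch's API. */
final class CommitReservations {
    // commit generation -> number of open searchers using it
    private final Map<Long, Integer> refCounts = new HashMap<>();

    /** A searcher calls this before opening the files of a commit. */
    synchronized void reserve(long generation) {
        refCounts.merge(generation, 1, Integer::sum);
    }

    /** A searcher calls this when it closes. */
    synchronized void release(long generation) {
        Integer count = refCounts.get(generation);
        if (count == null) {
            throw new IllegalStateException(
                "generation " + generation + " is not reserved");
        }
        if (count == 1) {
            refCounts.remove(generation);
        } else {
            refCounts.put(generation, count - 1);
        }
    }

    /** A deletion policy would ask this before removing a commit's files. */
    synchronized boolean isDeletable(long generation) {
        return !refCounts.containsKey(generation);
    }
}
```

With this shape, a commit's files stay in place for as long as any searcher holds a reservation, regardless of what the underlying filesystem does on delete.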
* *An HdfsUpdateLog that allows writing the transaction log to hdfs as well.*
I talked to Yonik a while back and I think we are in agreement that we don't
currently want to support a pluggable UpdateLog - so this one is built in and
is triggered by an hdfs:// prefixed update log path.
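The prefix trigger amounts to a simple dispatch on the configured path. A minimal sketch, assuming nothing about the patch beyond the hdfs:// prefix check; the class UpdateLogSelector, the Kind enum, and the example host and paths are all hypothetical.

```java
/** Hypothetical sketch; the class and enum names are illustrative only. */
final class UpdateLogSelector {
    enum Kind { LOCAL, HDFS }

    /** Picks the transaction log flavor from the configured update log path. */
    static Kind select(String updateLogPath) {
        if (updateLogPath != null && updateLogPath.startsWith("hdfs://")) {
            return Kind.HDFS;
        }
        return Kind.LOCAL;
    }

    public static void main(String[] args) {
        // The host name and directories below are made up for the example.
        System.out.println(select("hdfs://namenode:8020/solr/tlog"));
        System.out.println(select("/var/solr/data/tlog"));
    }
}
```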
* *An HdfsLockFactory.*
A simple implementation that writes lock files to hdfs rather than the local
filesystem.
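The general lock-file idea is "create the file only if it does not exist; whoever succeeds holds the lock". The sketch below shows that pattern against the local filesystem with java.nio so it runs without Hadoop. It is an analogy, not the patch's HdfsLockFactory, and the class LockFileSketch is hypothetical. On HDFS the comparable primitive would presumably be a non-overwriting create, which is an assumption here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical local-filesystem analogy; not the patch's HdfsLockFactory. */
final class LockFileSketch {
    /** Returns true if this caller created the lock file, false if it already existed. */
    static boolean tryObtain(Path lockFile) {
        try {
            Files.createFile(lockFile); // atomic: fails if the file already exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;               // someone else holds the lock
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock by deleting the lock file, if present. */
    static void release(Path lockFile) {
        try {
            Files.deleteIfExists(lockFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```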
* *SOLR-4655 Overseer should assign node names*
Includes the work for SOLR-4655. While a good general improvement, this is
also important for this patch because we use the node name in hdfs paths - if a
different machine takes over for that path, it's awkward to have the address
of another machine as part of it.
* *Tests*
There are a few new tests specifically written for HDFS. There are also a bunch
of new tests that simply run the current pertinent SolrCloud tests against hdfs.
Because the SolrCloud tests are already so long, this can greatly increase the
test run time on a slower machine. There is almost no noticeable slowdown on my
6 core machine, but it's pretty awful on my 2 core machine. To deal with this,
in my patch the tests that are functionally equivalent to current tests but run
against hdfs only run nightly.
> Add support to write and read Solr index files and transaction log files to
> and from HDFS.
> ------------------------------------------------------------------------------------------
>
> Key: SOLR-4916
> URL: https://issues.apache.org/jira/browse/SOLR-4916
> Project: Solr
> Issue Type: New Feature
> Reporter: Mark Miller
> Assignee: Mark Miller
> Attachments: SOLR-4916.patch
>
>