[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user markrmiller commented on the pull request: https://github.com/apache/lucene-solr/pull/34#issuecomment-217559403 We only have the needs of an hdfs client for shipping. I filed an issue to shrink that. Making a whole new contrib, extending the test running time, making the integration require more configuration and less just working, needing to be configurable instead of being able to take advantage of core integration...sorry, I'm against the change. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user romseygeek commented on the pull request: https://github.com/apache/lucene-solr/pull/34#issuecomment-217531618 @markrmiller I don't think we need to make people jump through any more hoops, and HDFS integration would stay as part of the core distribution, but moving it into a contrib allows us to offer a slimmed-down distro for people who aren't using hadoop. Plus it helps keep us honest when it comes to encapsulation and plugs a few leaky abstractions in the core.
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user markrmiller commented on the pull request: https://github.com/apache/lucene-solr/pull/34#issuecomment-217233476 I filed https://issues.apache.org/jira/browse/SOLR-9075 to look at shrinking the hdfs client dependency jars.
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user markrmiller commented on the pull request: https://github.com/apache/lucene-solr/pull/34#issuecomment-217231828 I'm not currently for this change. HDFS is currently built in and supported first class. I don't see a need to make anyone who wants to use it jump through any more hoops to save a few dependency jars. I'm still looking at making the first class integration better rather than separating things further.
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user romseygeek commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/34#discussion_r62206298

--- Diff: solr/core/src/java/org/apache/solr/update/UpdateHandler.java ---
@@ -200,4 +187,16 @@ public void registerOptimizeCallback( SolrEventListener listener )
   }

   public abstract void split(SplitIndexCommand cmd) throws IOException;
+
+  private static UpdateLog initialisePluginUpdateLog(SolrCore core, String dataDir, PluginInfo ulogPluginInfo)
+  {
+    String className = ulogPluginInfo.className;
+    if (System.getProperty("test.hdfs.forceHdfsUpdateLog") != null) {
+      className = "solr.HdfsUpdateLog";
+    }
+    if (className != null) {
+      return core.getResourceLoader().newInstance(className, UpdateLog.class);
+    }
+    return new UpdateLog();
+  }
 }
--- End diff --

I wonder if a nicer way of doing this would be to make UpdateLog be created by the DirectoryFactory? Something like:

```
DirectoryFactory {
  public UpdateLog newUpdateLog(SolrCore core, String dataDir, PluginInfo ulogPluginInfo) {
    if (ulogPluginInfo.className == null)
      return new UpdateLog();
    return core.getResourceLoader().newInstance(ulogPluginInfo.className, UpdateLog.class);
  }
}

HdfsDirectoryFactory {
  @Override
  public UpdateLog newUpdateLog( ... ) {
    if (ulogPluginInfo.className == null) {
      return new HdfsUpdateLog();
    }
    return super.newUpdateLog( ... );
  }
}
```
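The factory-method idea in the comment above can be sketched as compilable Java. This is only an illustration under stated assumptions: `UpdateLog`, `HdfsUpdateLog`, `DirectoryFactory` and `HdfsDirectoryFactory` here are minimal stand-ins, not the real Solr classes; `configuredClassName` plays the role of `ulogPluginInfo.className`; and the `loader` function is a hypothetical substitute for `core.getResourceLoader().newInstance(...)`.

```java
import java.util.function.Function;

// Hypothetical stand-ins for the classes named in the review comment.
// None of this is the real Solr API.
class UpdateLog {
  String describe() { return "UpdateLog"; }
}

class HdfsUpdateLog extends UpdateLog {
  @Override String describe() { return "HdfsUpdateLog"; }
}

class DirectoryFactory {
  // configuredClassName stands in for ulogPluginInfo.className;
  // loader stands in for the resource loader's newInstance call.
  UpdateLog newUpdateLog(String configuredClassName, Function<String, UpdateLog> loader) {
    if (configuredClassName == null) {
      return new UpdateLog(); // nothing configured: plain default
    }
    return loader.apply(configuredClassName); // explicit configuration wins
  }
}

class HdfsDirectoryFactory extends DirectoryFactory {
  @Override
  UpdateLog newUpdateLog(String configuredClassName, Function<String, UpdateLog> loader) {
    if (configuredClassName == null) {
      return new HdfsUpdateLog(); // nothing configured: HDFS-specific default
    }
    return super.newUpdateLog(configuredClassName, loader);
  }
}
```

With this shape, the default update log follows from whichever directory factory is in use, so neither an explicit class name in the configuration nor a test-only system property would be needed to get the HDFS variant. An explicitly configured class name still takes precedence in both factories.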
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user romseygeek commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/34#discussion_r62205397

--- Diff: solr/core/src/test/org/apache/solr/cloud/ShardSplitTest.java ---
@@ -57,7 +57,6 @@
 import static org.apache.solr.common.cloud.ZkStateReader.MAX_SHARDS_PER_NODE;
 import static org.apache.solr.common.cloud.ZkStateReader.REPLICATION_FACTOR;
-@Slow
--- End diff --

I don't think this should be here?
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user romseygeek commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/34#discussion_r62204058

--- Diff: solr/core/ivy.xml ---
@@ -61,15 +61,6 @@
-
-
-
-
-
--- End diff --

Do you know what requires auth in core? It would be nice to move *all* the hadoop jars out.
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user romseygeek commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/34#discussion_r62203539

--- Diff: lucene/core/src/java/org/apache/lucene/store/Directory.java ---
@@ -165,4 +165,13 @@ public void copyFrom(Directory from, String src, String dest, IOContext context)
    * @throws AlreadyClosedException if this Directory is closed
    */
   protected void ensureOpen() throws AlreadyClosedException {}
+
--- End diff --

I think we should avoid changing any lucene classes for the moment - fileModified() can probably stay where it is?
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
Github user dsmiley commented on the pull request: https://github.com/apache/lucene-solr/pull/34#issuecomment-216055389 I'm +1 to the overall notion of this but I haven't reviewed the code. I'm surprised Hadoop dependencies made it into Solr-core in the first place.
[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib
GitHub user tomjon opened a pull request:

    https://github.com/apache/lucene-solr/pull/34

    Move hdfs stuff out into a new contrib

An attempt to move hdfs/Hadoop related classes out of Solr core into a new contrib, and thereby reduce the size of Solr core by removing some of the Hadoop JARs.

Notes, in no particular order:

* tests that are sub-classed to create new hdfs tests are turned into BaseTest classes in test-framework, with a test sub-class added in core/test and hdfs/test
* dependencies on hadoop-auth and hadoop-minikdc are not moved into the contrib (these JARs are small anyway)
* a couple of uses of org.apache.hadoop.fs.Path in core are replaced by String manipulations
* should the blockcache package be moved in its entirety, or is it used by more than hdfs?
* to get an HdfsUpdateLog you now need to specify this class in solrconfig.xml in the UpdateLog config, instead of this happening automatically by having an hdfs:/ prefix to the data directory. This means that for the hdfs tests, which reuse the test-files from core, we use a system property to force the behaviour. Is there a better way?
* could move BadHdfsThreadsFilter into the hdfs contrib, but then would have to make morphlines depend on hdfs for this one class; in any case, BadHdfsThreadsFilter does not depend on any Hadoop classes
* map-reduce contrib now depends on hdfs contrib

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tomjon/lucene-solr solr-hdfs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/34.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #34

commit 22c85a06b04267337bf5f1bbb2ace0abddb87080
Author: Tom Winch
Date:   2016-04-28T15:15:49Z

    Move hdfs stuff out into a new contrib
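The pull request description mentions replacing a couple of uses of org.apache.hadoop.fs.Path in core with String manipulations, without showing them. A minimal sketch of what such a replacement might look like follows. The `PathStrings` class and its `join` and `name` methods are hypothetical and are not taken from the pull request; they also do far less normalization than Hadoop's Path performs.

```java
// Hypothetical helper, not taken from the pull request: plain String
// handling for the simple cases of joining a parent and child location
// and extracting the last path component.
final class PathStrings {
  private PathStrings() {}

  /** Joins parent and child with exactly one '/' between them. */
  static String join(String parent, String child) {
    String left = parent.endsWith("/")
        ? parent.substring(0, parent.length() - 1)
        : parent;
    String right = child.startsWith("/") ? child.substring(1) : child;
    return left + "/" + right;
  }

  /** Returns the last path component, ignoring a single trailing '/'. */
  static String name(String path) {
    String trimmed = (path.length() > 1 && path.endsWith("/"))
        ? path.substring(0, path.length() - 1)
        : path;
    int slash = trimmed.lastIndexOf('/');
    return slash < 0 ? trimmed : trimmed.substring(slash + 1);
  }
}
```

The appeal of this kind of helper is that core code handling data directory locations no longer needs a Hadoop class on the classpath. The trade-off is that scheme and authority handling, and any normalization beyond a single separator, would have to be dealt with wherever it matters.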