[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-06 Thread markrmiller
Github user markrmiller commented on the pull request:

https://github.com/apache/lucene-solr/pull/34#issuecomment-217559403
  
For shipping, we only need what an hdfs client needs. I filed an issue to 
shrink that. Making a whole new contrib, extending the test running time, 
making the integration require more configuration rather than just working, 
needing to be configurable instead of being able to take advantage of core 
integration... sorry, I'm against the change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-06 Thread romseygeek
Github user romseygeek commented on the pull request:

https://github.com/apache/lucene-solr/pull/34#issuecomment-217531618
  
@markrmiller I don't think we need to make people jump through any more 
hoops, and HDFS integration would stay as part of the core distribution, but 
moving it into a contrib allows us to offer a slimmed-down distro for people 
who aren't using hadoop.  Plus it helps keep us honest when it comes to 
encapsulation and plugs a few leaky abstractions in the core.





[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-05 Thread markrmiller
Github user markrmiller commented on the pull request:

https://github.com/apache/lucene-solr/pull/34#issuecomment-217233476
  
I filed https://issues.apache.org/jira/browse/SOLR-9075 to look at 
shrinking the hdfs client dependency jars.





[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-05 Thread markrmiller
Github user markrmiller commented on the pull request:

https://github.com/apache/lucene-solr/pull/34#issuecomment-217231828
  
I'm not currently for this change. HDFS is currently built in and supported 
first class. I don't see a need to make anyone who wants to use it jump 
through any more hoops to save a few dependency jars. I'm still looking at 
making the first class integration better rather than separating things further.





[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-05 Thread romseygeek
Github user romseygeek commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/34#discussion_r62206298
  
--- Diff: solr/core/src/java/org/apache/solr/update/UpdateHandler.java ---
@@ -200,4 +187,16 @@ public void registerOptimizeCallback( SolrEventListener listener )
   }
 
   public abstract void split(SplitIndexCommand cmd) throws IOException;
+  
+  private static UpdateLog initialisePluginUpdateLog(SolrCore core, String dataDir, PluginInfo ulogPluginInfo)
+  {
+    String className = ulogPluginInfo.className;
+    if (System.getProperty("test.hdfs.forceHdfsUpdateLog") != null) {
+      className = "solr.HdfsUpdateLog";
+    }
+    if (className != null) {
+      return core.getResourceLoader().newInstance(className, UpdateLog.class);
+    }
+    return new UpdateLog();
+  }
 }
--- End diff --

I wonder if a nicer way of doing this would be to make UpdateLog be created 
by the DirectoryFactory? Something like:

```
DirectoryFactory {
  public UpdateLog newUpdateLog(SolrCore core, String dataDir, PluginInfo ulogPluginInfo) {
    if (ulogPluginInfo.className == null) {
      return new UpdateLog();
    }
    return core.getResourceLoader().newInstance(ulogPluginInfo.className, UpdateLog.class);
  }
}

HdfsDirectoryFactory {
  @Override
  public UpdateLog newUpdateLog( ... ) {
    if (ulogPluginInfo.className == null) {
      return new HdfsUpdateLog();
    }
    return super.newUpdateLog( ... );
  }
}
```
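
To make the idea above concrete, here is a minimal, self-contained sketch of the same factory-method override. It is only an illustration under stated assumptions: the four classes are simplified stand-ins for the Solr types of the same names, `className` stands in for `ulogPluginInfo.className`, and the hypothetical `instantiate` helper replaces the resource loader lookup.

```java
// Stand-in types only: none of these are the real Solr classes.
class UpdateLog {
  String describe() { return "local"; }
}

class HdfsUpdateLog extends UpdateLog {
  @Override
  String describe() { return "hdfs"; }
}

class DirectoryFactory {
  // className stands in for ulogPluginInfo.className; null means "not configured".
  UpdateLog newUpdateLog(String className) {
    if (className == null) {
      return new UpdateLog();
    }
    return instantiate(className);
  }

  // Hypothetical substitute for core.getResourceLoader().newInstance(...).
  static UpdateLog instantiate(String className) {
    switch (className) {
      case "solr.UpdateLog":
        return new UpdateLog();
      case "solr.HdfsUpdateLog":
        return new HdfsUpdateLog();
      default:
        throw new IllegalArgumentException("Unknown update log class: " + className);
    }
  }
}

class HdfsDirectoryFactory extends DirectoryFactory {
  // Only the default changes; an explicitly configured class still wins.
  @Override
  UpdateLog newUpdateLog(String className) {
    if (className == null) {
      return new HdfsUpdateLog();
    }
    return super.newUpdateLog(className);
  }
}

class UpdateLogFactorySketch {
  public static void main(String[] args) {
    System.out.println(new DirectoryFactory().newUpdateLog(null).describe());
    System.out.println(new HdfsDirectoryFactory().newUpdateLog(null).describe());
    System.out.println(new HdfsDirectoryFactory().newUpdateLog("solr.UpdateLog").describe());
  }
}
```

With this shape, core would need neither a reference to `HdfsUpdateLog` nor a test-only system property: whichever directory factory is configured decides the default.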





[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-05 Thread romseygeek
Github user romseygeek commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/34#discussion_r62205397
  
--- Diff: solr/core/src/test/org/apache/solr/cloud/ShardSplitTest.java ---
@@ -57,7 +57,6 @@
 import static org.apache.solr.common.cloud.ZkStateReader.MAX_SHARDS_PER_NODE;
 import static org.apache.solr.common.cloud.ZkStateReader.REPLICATION_FACTOR;
 
-@Slow
--- End diff --

I don't think this should be here?





[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-05 Thread romseygeek
Github user romseygeek commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/34#discussion_r62204058
  
--- Diff: solr/core/ivy.xml ---
@@ -61,15 +61,6 @@
 
 
 
-
-
-
-
-
 
--- End diff --

Do you know what requires hadoop-auth in core?  It would be nice to move 
*all* the hadoop jars out.





[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-05 Thread romseygeek
Github user romseygeek commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/34#discussion_r62203539
  
--- Diff: lucene/core/src/java/org/apache/lucene/store/Directory.java ---
@@ -165,4 +165,13 @@ public void copyFrom(Directory from, String src, String dest, IOContext context)
    * @throws AlreadyClosedException if this Directory is closed
    */
   protected void ensureOpen() throws AlreadyClosedException {}
+  
--- End diff --

I think we should avoid changing any lucene classes for the moment - 
fileModified() can probably stay where it is?





[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-05-01 Thread dsmiley
Github user dsmiley commented on the pull request:

https://github.com/apache/lucene-solr/pull/34#issuecomment-216055389
  
I'm +1 to the overall notion of this but I haven't reviewed the code. I'm 
surprised Hadoop dependencies made it into Solr-core in the first place. 





[GitHub] lucene-solr pull request: Move hdfs stuff out into a new contrib

2016-04-28 Thread tomjon
GitHub user tomjon opened a pull request:

https://github.com/apache/lucene-solr/pull/34

Move hdfs stuff out into a new contrib

An attempt to move hdfs/Hadoop related classes out of core Solr into a new 
contrib, and thereby reduce the size of Solr core by removing some of the 
Hadoop JARs. Notes, in no particular order:

* tests that are sub-classed to create new hdfs tests are turned into 
BaseTest classes in test-framework, with a test sub-class added in core/test 
and hdfs/test
* dependencies on hadoop-auth and hadoop-minikdc are not moved into the 
contrib (these JARs are small anyway)
* a couple of uses of org.apache.hadoop.fs.Path in core are replaced by 
String manipulations
* should the blockcache package be moved in its entirety, or is it used by 
more than hdfs?
* to get an HdfsUpdateLog you now need to specify this class in 
solrconfig.xml in the UpdateLog config, instead of this happening 
automatically when the data directory has an hdfs:/ prefix. This means that 
for the hdfs tests, which reuse the test-files from core, we use a system 
property to force the behaviour. Is there a better way?
* BadHdfsThreadsFilter could move into the hdfs contrib, but then morphlines 
would have to depend on hdfs for this one class; in any case, 
BadHdfsThreadsFilter does not depend on any Hadoop classes
* the map-reduce contrib now depends on the hdfs contrib
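
As an illustration of the "String manipulations" note above, here is a rough sketch of what replacing simple Path usage with plain strings could look like. It is not taken from the patch: `PathStrings`, `join` and `name` are hypothetical names, and they only approximate the parent/child joining and last-component lookup that org.apache.hadoop.fs.Path offers, without any URI normalisation.

```java
// Hypothetical helper, not part of the patch: plain-String stand-ins for two
// common uses of org.apache.hadoop.fs.Path, with no Hadoop dependency.
final class PathStrings {
  private PathStrings() {}

  // Joins parent and child with exactly one '/' between them.
  static String join(String parent, String child) {
    int end = parent.length();
    while (end > 0 && parent.charAt(end - 1) == '/') {
      end--; // drop trailing slashes from the parent
    }
    int start = 0;
    while (start < child.length() && child.charAt(start) == '/') {
      start++; // drop leading slashes from the child
    }
    return parent.substring(0, end) + "/" + child.substring(start);
  }

  // Returns the last path component, ignoring any trailing slashes.
  static String name(String path) {
    int end = path.length();
    while (end > 0 && path.charAt(end - 1) == '/') {
      end--;
    }
    int slash = path.lastIndexOf('/', end - 1);
    return path.substring(slash + 1, end);
  }
}

class PathStringsSketch {
  public static void main(String[] args) {
    // The namenode address below is a made-up example value.
    String dataDir = PathStrings.join("hdfs://namenode:8020/solr/", "/core1/data");
    System.out.println(dataDir);
    System.out.println(PathStrings.name(dataDir));
  }
}
```

The trade-off is that string handling like this keeps core free of Hadoop classes but gives up the validation and normalisation that Path performs.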

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tomjon/lucene-solr solr-hdfs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/34.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #34


commit 22c85a06b04267337bf5f1bbb2ace0abddb87080
Author: Tom Winch 
Date:   2016-04-28T15:15:49Z

Move hdfs stuff out into a new contrib



