[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2022-04-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528430#comment-17528430
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on PR #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-1110297153

   @dbw9580 I was looking at this again after someone tagged me in an issue, 
and I think there is a much easier way to get this done.  Take a look at the PR 
that I submitted to add the Dropbox file system to Drill 
(https://github.com/apache/drill/pull/2337).  It re-uses a lot of Drill's 
internals so that you don't have to deal with all the format readers.  
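
The idea in the comment above, as far as it can be read from the thread, is to expose the remote store through a file-system-style abstraction so that format readers written once keep working for every backend. The following is only a minimal plain-JDK sketch of that idea under stated assumptions: `ByteSource`, `inMemory`, and `readCsv` are hypothetical names invented for illustration and are not Drill, Hadoop, or Dropbox APIs, and the actual PR may be structured differently.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class FileSystemReuseSketch {

  /** Hypothetical stand-in for a file-system abstraction; not a Drill or Hadoop type. */
  interface ByteSource {
    InputStream open(String path) throws IOException;
  }

  /** Hypothetical in-memory backend standing in for a remote store such as Dropbox or IPFS. */
  static ByteSource inMemory(Map<String, String> files) {
    return path -> {
      String content = files.get(path);
      if (content == null) {
        throw new FileNotFoundException(path);
      }
      return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    };
  }

  /** A format reader written once against the abstraction, independent of the backend. */
  static List<String[]> readCsv(ByteSource source, String path) {
    List<String[]> rows = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(source.open(path), StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        if (!line.isEmpty()) {
          rows.add(line.split(",", -1)); // naive split; no quoting support
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return rows;
  }

  public static void main(String[] args) {
    ByteSource source = inMemory(Map.of("/demo.csv", "id,name\n1,ipfs\n"));
    List<String[]> rows = readCsv(source, "/demo.csv");
    System.out.println(rows.size() + " rows, first header: " + rows.get(0)[0]);
  }
}
```

With this split, adding a new backend means writing one more `ByteSource`-like adapter, while the reader code stays untouched.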




> Add storage plugin for IPFS
> ---
>
> Key: DRILL-7745
> URL: https://issues.apache.org/jira/browse/DRILL-7745
> Project: Apache Drill
> Issue Type: New Feature
> Components: Storage - Other
> Affects Versions: 1.18.0
> Reporter: Bowen Ding
> Assignee: Bowen Ding
> Priority: Major
>
> See introduction here: [https://github.com/bdchain/Minerva]
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242948#comment-17242948
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-737693796


   > @dbw9580
   > Hi there! I hope all is well. Are you still interested in completing this 
PR?
   
   Yes. I'm currently busy with other projects and haven't had time to look 
further into this. I remember we discussed the way this plugin interacts with 
the Drill coordinator, which needs a major design reconsideration. When I can 
spare more time on this, I will continue where I left off.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17241541#comment-17241541
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-736568269


   @dbw9580 
   Hi there!  I hope all is well.  Are you still interested in completing this 
PR?  





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191198#comment-17191198
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-687702938


   The test failure looks unrelated to this change. 





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191070#comment-17191070
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r483963623



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,452 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.cid.Cid;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSGroupScan.class);
+  private final IPFSContext ipfsContext;
+  private final IPFSScanSpec ipfsScanSpec;
+  private final IPFSStoragePluginConfig config;
+  private List columns;
+
+  private static final long DEFAULT_NODE_SIZE = 1000L;
+  public static final int DEFAULT_USER_PORT = 31010;
+  public static final int DEFAULT_CONTROL_PORT = 31011;
+  public static final int DEFAULT_DATA_PORT = 31012;
+  public static final int DEFAULT_HTTP_PORT = 8047;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private ListMultimap endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+                       @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+                       @JsonProperty("columns") List columns,
+                       @JacksonInject StoragePluginRegistry pluginRegistry) {
+    this(
+        pluginRegistry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext(),
+        ipfsScanSpec,
+        columns
+    );
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183206#comment-17183206
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

vvysotskyi commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r475563244



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182826#comment-17182826
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

vvysotskyi commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r475253606



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181198#comment-17181198
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-677673312


   @dbw9580 
   What would happen in the following scenario?  Let's say user A executes a 
query using IPFS, which spins up new Drillbits.  User B then decides to execute 
a query that does not use IPFS.  If these two queries run concurrently, could 
user B's query end up on the Drillbits for IPFS and then either not find data 
or cause some other problem?
   
   Alternatively, what would happen if user B executes a query while user A's 
IPFS queries are running, and user A's query completes before user B's?  Would 
it tear down the Drillbits and cause a crash?
   
   I'm asking because I really don't know here.  





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181171#comment-17181171
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r473949613



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181169#comment-17181169
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r473949613



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179670#comment-17179670
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r472253999



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all API calls to be made with POST method.
+ * Upstream issue tracker: https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */

Review comment:
   Fixed in 41dce52.
   
   > Should this now work with all versions of IPFS?
   
   Probably. At least 0.4.23 and 0.6.0 that I tested with should work.
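
   For context, here is a minimal sketch of the POST-only calling convention that the compatibility note above refers to. It assumes a local IPFS daemon exposing its HTTP API at the default address http://127.0.0.1:5001. The class and helper names (IpfsPostSketch, openPost) are hypothetical and are not part of this PR or of java-ipfs-http-client. As far as I know, go-ipfs 0.5.0 and later reject API calls that are not made with POST. The sketch only prepares the request and does not contact the daemon.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class IpfsPostSketch {

  // Hypothetical helper (not from the PR): prepares, but does not send,
  // a POST request against an IPFS HTTP API command such as "version".
  static HttpURLConnection openPost(String apiBase, String command) throws IOException {
    URL url = new URL(apiBase + "/api/v0/" + command);
    // openConnection() only creates the connection object; no network I/O happens yet.
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setConnectTimeout(5_000);
    conn.setReadTimeout(5_000);
    return conn;
  }

  public static void main(String[] args) throws IOException {
    // Assumed default API address of a local daemon.
    HttpURLConnection conn = openPost("http://127.0.0.1:5001", "version");
    System.out.println(conn.getRequestMethod() + " " + conn.getURL());
  }
}
```

   Calling conn.getResponseCode() on the returned connection would actually send the request, so that step is left out here to keep the sketch independent of a running daemon.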









[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179648#comment-17179648
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r472234320



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,452 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.cid.Cid;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSGroupScan.class);
+  private final IPFSContext ipfsContext;
+  private final IPFSScanSpec ipfsScanSpec;
+  private final IPFSStoragePluginConfig config;
+  private List columns;
+
+  private static final long DEFAULT_NODE_SIZE = 1000L;
+  public static final int DEFAULT_USER_PORT = 31010;
+  public static final int DEFAULT_CONTROL_PORT = 31011;
+  public static final int DEFAULT_DATA_PORT = 31012;
+  public static final int DEFAULT_HTTP_PORT = 8047;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private ListMultimap endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) {
+this(
+pluginRegistry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179645#comment-17179645
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r472228486



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,452 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179628#comment-17179628
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r472205448



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,452 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179622#comment-17179622
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r472198914



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,452 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179093#comment-17179093
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

vvysotskyi commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r471589662



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,452 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179088#comment-17179088
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r471582499



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,452 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.cid.Cid;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import 
org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(IPFSGroupScan.class);
+  private final IPFSContext ipfsContext;
+  private final IPFSScanSpec ipfsScanSpec;
+  private final IPFSStoragePluginConfig config;
+  private List columns;
+
+  private static final long DEFAULT_NODE_SIZE = 1000L;
+  public static final int DEFAULT_USER_PORT = 31010;
+  public static final int DEFAULT_CONTROL_PORT = 31011;
+  public static final int DEFAULT_DATA_PORT = 31012;
+  public static final int DEFAULT_HTTP_PORT = 8047;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private ListMultimap endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List<SchemaPath> columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) {
+this(
+pluginRegistry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179084#comment-17179084
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

vvysotskyi commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r471575915



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,452 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.cid.Cid;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSGroupScan.class);
+  private final IPFSContext ipfsContext;
+  private final IPFSScanSpec ipfsScanSpec;
+  private final IPFSStoragePluginConfig config;
+  private List<SchemaPath> columns;
+
+  private static final long DEFAULT_NODE_SIZE = 1000L;
+  public static final int DEFAULT_USER_PORT = 31010;
+  public static final int DEFAULT_CONTROL_PORT = 31011;
+  public static final int DEFAULT_DATA_PORT = 31012;
+  public static final int DEFAULT_HTTP_PORT = 8047;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private ListMultimap endpointWorksMap;
+  private List<EndpointAffinity> affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List<SchemaPath> columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) {
+this(
+pluginRegistry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179083#comment-17179083
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674963101


   @cgivre sure, but I have to do these tomorrow (it's now midnight in my 
timezone). And maybe allow some time for the IPFS API repo to release an 
official version: 
https://github.com/ipfs-shipyard/java-ipfs-http-client/pull/172#issuecomment-674938957
 ?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add storage plugin for IPFS
> ---
>
> Key: DRILL-7745
> URL: https://issues.apache.org/jira/browse/DRILL-7745
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.18.0
>Reporter: Bowen Ding
>Assignee: Bowen Ding
>Priority: Major
>
> See introduction here: [https://github.com/bdchain/Minerva]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179076#comment-17179076
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre edited a comment on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674957594


   @dbw9580 
   This is looking pretty good. I'm going to do a final check this evening or 
tomorrow, but can you please:
   
   1.  Squash all commits and use `DRILL-7745: Add storage plugin for IPFS`
as the commit message.
   2.  Go through and do a final code hygiene check (make sure there are no
extra spaces, commented-out blocks, etc.). Drill does have a code formatter [1],
so please verify that the code complies with the coding standards for spacing
and so on.  (I didn't see anything jump out at me, but it always helps to
double-check.)
   3.  Please create a JIRA to add this to the public documentation.  You're 
welcome to actually add the documentation as well, but for now, let's just make 
sure we have a JIRA on file to actually add the docs. 
   
   Thanks!
   
   [1]: https://drill.apache.org/docs/apache-drill-contribution-guidelines/
   





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179073#comment-17179073
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674957594


   @dbw9580 
   This is looking pretty good. I'm going to do a final check this evening or 
tomorrow, but can you please:
   1.  Squash all commits and use `DRILL-7745: Add storage plugin for IPFS`
as the commit message.
   2.  Go through and do a final code hygiene check (make sure there are no
extra spaces, commented-out blocks, etc.). Drill does have a code formatter [1],
so please verify that the code complies with the coding standards for spacing
and so on.  (I didn't see anything jump out at me, but it always helps to
double-check.)
   
   Thanks!
   
   [1]: https://drill.apache.org/docs/apache-drill-contribution-guidelines/
   





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179057#comment-17179057
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674942760


   @cgivre I think it's based on the current master already.





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179056#comment-17179056
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674940667


   @dbw9580 
   Can you please rebase on current master as well?





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179053#comment-17179053
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r471546082



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java
##
@@ -0,0 +1,187 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.DeserializationContext;
+import com.fasterxml.jackson.databind.JsonDeserializer;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
+import com.fasterxml.jackson.databind.annotation.JsonSerialize;
+import io.ipfs.cid.Cid;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+
+
+@JsonTypeName("ipfs-sub-scan")
+public class IPFSSubScan extends AbstractBase implements SubScan {
+  private static final int IPFS_SUB_SCAN_VALUE = 19155;

Review comment:
   Done.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178523#comment-17178523
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r471131681



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java
##
@@ -0,0 +1,187 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.DeserializationContext;
+import com.fasterxml.jackson.databind.JsonDeserializer;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
+import com.fasterxml.jackson.databind.annotation.JsonSerialize;
+import io.ipfs.cid.Cid;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+
+
+@JsonTypeName("ipfs-sub-scan")
+public class IPFSSubScan extends AbstractBase implements SubScan {
+  private static final int IPFS_SUB_SCAN_VALUE = 19155;

Review comment:
   One more thing... make sure that you are using the correct version(s) of
`protoc` on your machine; otherwise, the CI will reject your protobufs.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178522#comment-17178522
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r471131616



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java
##
@@ -0,0 +1,187 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.DeserializationContext;
+import com.fasterxml.jackson.databind.JsonDeserializer;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
+import com.fasterxml.jackson.databind.annotation.JsonSerialize;
+import io.ipfs.cid.Cid;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+
+
+@JsonTypeName("ipfs-sub-scan")
+public class IPFSSubScan extends AbstractBase implements SubScan {
+  private static final int IPFS_SUB_SCAN_VALUE = 19155;

Review comment:
   I think it's time to update the protobufs to include this value.  
   You'll need to add the `IPFS_SUB_SCAN_VALUE` here:
   
https://github.com/apache/drill/blob/0726b83d9347cbb8bd1bc64a8d10c12c1125549a/protocol/src/main/protobuf/UserBitShared.proto#L383-L385
   
   Then update the protobufs.  You can find the instructions here:
   -- https://github.com/apache/drill/tree/master/protocol
   and here for the native client.
   -- https://github.com/apache/drill/tree/master/contrib/native/client
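
   As a rough illustration of what this change buys on the Java side, here is a
minimal, self-contained sketch. It assumes the new entry would be called
`IPFS_SUB_SCAN` and would keep the number 19155 from the diff above; both are
assumptions, and the nested `CoreOperatorType` below is only a hypothetical
stand-in for the class that `protoc` generates, not the real Drill code. The
one thing it relies on from protobuf-java is that each generated enum entry
comes with a matching `*_VALUE` int constant.

```java
public class OperatorTypeSketch {

  // Hypothetical stand-in for the enum that protoc would generate from
  // UserBitShared.proto. The entry name and number are assumptions taken
  // from the constant in the quoted diff, not from the real .proto file.
  enum CoreOperatorType {
    IPFS_SUB_SCAN(19155);

    // protobuf-java emits a *_VALUE int constant next to each enum entry.
    static final int IPFS_SUB_SCAN_VALUE = 19155;

    private final int number;

    CoreOperatorType(int number) {
      this.number = number;
    }

    int getNumber() {
      return number;
    }
  }

  // What the sub scan could return once the generated constant exists,
  // instead of keeping its own private copy of the number.
  static int getOperatorType() {
    return CoreOperatorType.IPFS_SUB_SCAN_VALUE;
  }

  public static void main(String[] args) {
    System.out.println(getOperatorType());
  }
}
```

   With the value defined in one place, the private `IPFS_SUB_SCAN_VALUE`
field in `IPFSSubScan` would no longer be needed.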







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178518#comment-17178518
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674544534


   > 
   > Hi @dbw9580
   > Thanks for these updates. I didn't have any issues running your unit tests 
before this. However, I took a look at the Maven docs, and I'm wondering if you 
can specify the number of forks directly in the `pom.xml` file. 
[1](https://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html)
   
   Thanks!! 





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178512#comment-17178512
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674541877


   Hi @dbw9580 
   Thanks for these updates.  I didn't have any issues running your unit tests 
before this.  However, I took a look at the Maven docs, and I'm wondering if 
you can specify the number of forks directly in the `pom.xml` file. [1]
   
   [1]: 
https://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html
   
   





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178509#comment-17178509
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674540984


   I found that when I run the tests with `mvn clean test -DforkMode=never`,
the `port already in use` errors go away. I have no idea why.
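
   For context, one common way tests avoid `port already in use` errors,
regardless of fork settings, is to let the OS pick a free port rather than
hard-coding one. The sketch below is a general technique, not something this
PR is known to do, and it does not explain why `-DforkMode=never` helped. The
class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

public class FreePortSketch {

  // Ask the OS for a currently unused TCP port by binding to port 0,
  // then release it so the test can bind its own server to that port.
  static int pickFreePort() {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("free port: " + pickFreePort());
  }
}
```

   Note that there is a small window between releasing the port and binding
to it again, so another process could still grab it in between.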





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177819#comment-17177819
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470681998



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSStoragePluginConfig.java
##
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.base.Objects;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Maps;
+
+import java.security.InvalidParameterException;
+import java.util.Map;
+
+@JsonTypeName(IPFSStoragePluginConfig.NAME)
+public class IPFSStoragePluginConfig extends StoragePluginConfigBase{
+public static final String NAME = "ipfs";
+
+@JsonProperty
+private final String host;
+
+@JsonProperty
+private final int port;
+
+@JsonProperty("max-nodes-per-leaf")
+private final int maxNodesPerLeaf;
+
+@JsonProperty("ipfs-timeouts")
+private final Map<IPFSTimeOut, Integer> ipfsTimeouts;
+
+@JsonIgnore
+private static final Map<IPFSTimeOut, Integer> ipfsTimeoutDefaults = ImmutableMap.of(
+IPFSTimeOut.FIND_PROV, 4,
+IPFSTimeOut.FIND_PEER_INFO, 4,
+IPFSTimeOut.FETCH_DATA, 6
+);
+
+public enum IPFSTimeOut {
+@JsonProperty("find-provider")
+FIND_PROV("find-provider"),
+@JsonProperty("find-peer-info")
+FIND_PEER_INFO("find-peer-info"),
+@JsonProperty("fetch-data")
+FETCH_DATA("fetch-data");
+
+@JsonProperty("type")
+private final String which;
+IPFSTimeOut(String which) {
+this.which = which;
+}
+
+@JsonCreator
+public static IPFSTimeOut of(String which) {
+switch (which) {
+case "find-provider":
+return FIND_PROV;
+case "find-peer-info":
+return FIND_PEER_INFO;
+case "fetch-data":
+return FETCH_DATA;
+default:
+throw new InvalidParameterException("Unknown key for IPFS timeout config entry: " + which);
+}
+}
+
+@Override
+public String toString() {
+return this.which;
+}
+}
+
+@JsonProperty("groupscan-worker-threads")
+private final int numWorkerThreads;
+
+@JsonProperty
+private final Map<String, FormatPluginConfig> formats;
+
+@JsonCreator
+public IPFSStoragePluginConfig(
+@JsonProperty("host") String host,
+@JsonProperty("port") int port,
+@JsonProperty("max-nodes-per-leaf") int maxNodesPerLeaf,
+@JsonProperty("ipfs-timeouts") Map<IPFSTimeOut, Integer> ipfsTimeouts,
+@JsonProperty("groupscan-worker-threads") int numWorkerThreads,
+@JsonProperty("formats") Map<String, FormatPluginConfig> formats) {
+this.host = host;
+this.port = port;
+this.maxNodesPerLeaf = maxNodesPerLeaf > 0 ? maxNodesPerLeaf : 1;
+if (ipfsTimeouts != null) {
+this.ipfsTimeouts = Maps.newHashMap();
+ipfsTimeouts.forEach(this.ipfsTimeouts::put);
+ipfsTimeoutDefaults.forEach(this.ipfsTimeouts::putIfAbsent);
+} else {
+this.ipfsTimeouts = ipfsTimeoutDefaults;
+}
+this.numWorkerThreads = numWorkerThreads > 0 ? numWorkerThreads : 1;
+this.formats = formats;
+}
+
+public String getHost() {
+return host;
+}
+
+public int getPort() {
+return port;
+}
+
+public int getMaxNodesPerLeaf() {
+return maxNodesPerLeaf;
+}
+
+public int 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177811#comment-17177811
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470676371



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSScanSpec.java
##
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.security.InvalidParameterException;
+import java.util.Set;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+
+@JsonTypeName("IPFSScanSpec")
+public class IPFSScanSpec {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSScanSpec.class);
+
+  public enum Prefix {
+@JsonProperty("ipfs")
+IPFS("ipfs"),
+@JsonProperty("ipns")
+IPNS("ipns");
+
+@JsonProperty("prefix")
+private final String name;
+Prefix(String prefix) {
+  this.name = prefix;
+}
+
+@Override
+public String toString() {
+  return this.name;
+}
+
+@JsonCreator
+public static Prefix of(String what) {
+  switch (what) {
+case "ipfs" :
+  return IPFS;
+case "ipns":
+  return IPNS;
+default:
+  throw new InvalidParameterException("Unsupported prefix: " + what);
+  }
+}
+  }
+
+  public enum Format {
+@JsonProperty("json")
+JSON("json"),
+@JsonProperty("csv")
+CSV("csv");
+
+@JsonProperty("format")
+private final String name;
+Format(String prefix) {
+  this.name = prefix;
+}
+
+@Override
+public String toString() {
+  return this.name;
+}
+
+@JsonCreator
+public static Format of(String what) {
+  switch (what) {
+case "json" :
+  return JSON;
+case "csv":
+  return CSV;
+default:
+  throw new InvalidParameterException("Unsupported format: " + what);
+  }
+}
+  }
+
+  public static Set<String> formats = ImmutableSet.of("json", "csv");
+  private Prefix prefix;
+  private String path;
+  private Format formatExtension;
+  private final IPFSContext ipfsContext;
+
+  @JsonCreator
+  public IPFSScanSpec (@JacksonInject StoragePluginRegistry registry,
+   @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("prefix") Prefix prefix,
+   @JsonProperty("format") Format format,
+   @JsonProperty("path") String path) {
+this.ipfsContext = registry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext();
+this.prefix = prefix;
+this.formatExtension = format;
+this.path = path;
+  }
+
+  public IPFSScanSpec (IPFSContext ipfsContext, String path) {
+this.ipfsContext = ipfsContext;
+parsePath(path);
+  }
+
+  private void parsePath(String path) {
+//FIXME: IPFS hashes are actually Base58 encoded, so "0" "O" "I" "l" are not valid

Review comment:
   Removed in d2ea637.
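
   For readers following the thread: the FIXME quoted above points out that the
   Base58 alphabet excludes "0", "O", "I" and "l". A minimal sketch of a stricter
   check is below, assuming only CIDv0 hashes (46 characters, starting with "Qm").
   The class and method names are hypothetical and are not part of this PR, and
   CIDv1 or other multibase encodings are deliberately not handled.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of PR #2084: checks whether a string looks like a CIDv0 hash.
final class Base58PathCheck {
  // Base58 (Bitcoin alphabet): digits 1-9 plus letters, excluding 0, O, I and l.
  // A CIDv0 is "Qm" followed by 44 more Base58 characters (46 in total).
  private static final Pattern CID_V0 = Pattern.compile("^Qm[1-9A-HJ-NP-Za-km-z]{44}$");

  private Base58PathCheck() {
  }

  static boolean looksLikeCidV0(String hash) {
    return hash != null && CID_V0.matcher(hash).matches();
  }

  public static void main(String[] args) {
    // The null-object hash that appears in IPFSHelper is a well-formed CIDv0.
    System.out.println(looksLikeCidV0("QmdfTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n"));
    // Same length, but contains "0", which is outside the Base58 alphabet.
    System.out.println(looksLikeCidV0("Qm0fTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n"));
  }
}
```

   This only validates the shape of the hash; it does not decode it or verify
   that the multihash is well formed.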





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add storage plugin for IPFS
> ---
>
>

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177814#comment-17177814
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470676806



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSScanSpec.java
##
@@ -0,0 +1,217 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.security.InvalidParameterException;
+import java.util.Set;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+
+@JsonTypeName("IPFSScanSpec")
+public class IPFSScanSpec {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSScanSpec.class);
+
+  public enum Prefix {
+@JsonProperty("ipfs")
+IPFS("ipfs"),
+@JsonProperty("ipns")
+IPNS("ipns");
+
+@JsonProperty("prefix")
+private final String name;
+Prefix(String prefix) {
+  this.name = prefix;
+}
+
+@Override
+public String toString() {
+  return this.name;
+}
+
+@JsonCreator
+public static Prefix of(String what) {
+  switch (what) {
+case "ipfs" :
+  return IPFS;
+case "ipns":
+  return IPNS;
+default:
+  throw new InvalidParameterException("Unsupported prefix: " + what);
+  }
+}
+  }
+
+  public enum Format {
+@JsonProperty("json")
+JSON("json"),
+@JsonProperty("csv")
+CSV("csv");
+
+@JsonProperty("format")
+private final String name;
+Format(String prefix) {
+  this.name = prefix;
+}
+
+@Override
+public String toString() {
+  return this.name;
+}
+
+@JsonCreator
+public static Format of(String what) {
+  switch (what) {
+case "json" :
+  return JSON;
+case "csv":
+  return CSV;
+default:
+  throw new InvalidParameterException("Unsupported format: " + what);
+  }
+}
+  }
+
+  public static Set<String> formats = ImmutableSet.of("json", "csv");
+  private Prefix prefix;
+  private String path;
+  private Format formatExtension;
+  private final IPFSContext ipfsContext;
+
+  @JsonCreator
+  public IPFSScanSpec (@JacksonInject StoragePluginRegistry registry,
+   @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("prefix") Prefix prefix,
+   @JsonProperty("format") Format format,
+   @JsonProperty("path") String path) {
+this.ipfsContext = registry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext();
+this.prefix = prefix;
+this.formatExtension = format;
+this.path = path;
+  }
+
+  public IPFSScanSpec (IPFSContext ipfsContext, String path) {
+this.ipfsContext = ipfsContext;
+parsePath(path);
+  }
+
+  private void parsePath(String path) {
+//FIXME: IPFS hashes are actually Base58 encoded, so "0" "O" "I" "l" are not valid
+//also CIDs can be encoded with different encodings, not necessarily Base58
+Pattern tableNamePattern = Pattern.compile("^/(ipfs|ipns)/([A-Za-z0-9]{46}(/[^#]+)*)(?:#(\\w+))?$");
+Matcher matcher = tableNamePattern.matcher(path);
+if (!matcher.matches()) {
+  throw UserException.validationError().message("Invalid IPFS path in query string. Use paths of pattern `/scheme/hashpath#format`, where scheme:= 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177810#comment-17177810
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470676009



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSHelper.java
##
@@ -0,0 +1,326 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multiaddr.MultiAddress;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.bouncycastle.util.Strings;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.stream.Collectors;
+
+import static org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.FETCH_DATA;
+import static org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.FIND_PEER_INFO;
+
+/**
+ * Helper class with some utilities that are specific to Drill with an IPFS storage
+ */
+public class IPFSHelper {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSHelper.class);
+
+  public static final String IPFS_NULL_OBJECT_HASH = "QmdfTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n";
+  public static final Multihash IPFS_NULL_OBJECT = Multihash.fromBase58(IPFS_NULL_OBJECT_HASH);
+
+  private ExecutorService executorService;
+  private final IPFS client;
+  private final IPFSCompat clientCompat;
+  private IPFSPeer myself;
+  private int maxPeersPerLeaf;
+  private Map<IPFSTimeOut, Integer> timeouts;
+
+  public IPFSHelper(IPFS ipfs) {
+this.client = ipfs;
+this.clientCompat = new IPFSCompat(ipfs);
+  }
+
+  public IPFSHelper(IPFS ipfs, ExecutorService executorService) {
+this(ipfs);
+this.executorService = executorService;
+  }
+
+  public void setTimeouts(Map<IPFSTimeOut, Integer> timeouts) {
+this.timeouts = timeouts;
+  }
+
+  public void setMyself(IPFSPeer myself) {
+this.myself = myself;
+  }
+
+  /**
+   * Set the maximum number of providers per leaf node. Searching for more providers makes DHT queries take longer,
+   * but makes it more likely that an optimal peer is found.
+   * @param maxPeersPerLeaf max number of providers to search per leaf node
+   */
+  public void setMaxPeersPerLeaf(int maxPeersPerLeaf) {
+this.maxPeersPerLeaf = maxPeersPerLeaf;
+  }
+
+  public IPFS getClient() {
+return client;
+  }
+
+  public IPFSCompat getClientCompat() {
+return clientCompat;
+  }
+
+  public List<Multihash> findprovsTimeout(Multihash id) {
+List<String> providers;
+providers = clientCompat.dht.findprovsListTimeout(id, maxPeersPerLeaf, timeouts.get(IPFSTimeOut.FIND_PROV), executorService);
+
+return providers.stream().map(Multihash::fromBase58).collect(Collectors.toList());
+  }
+
+  public List<MultiAddress> findpeerTimeout(Multihash peerId) {
+// trying to resolve addresses of a node itself will always hang
+// so we treat it specially
+if (peerId.equals(myself.getId())) {
+  return myself.getMultiAddresses();
+}
+
+List<String> addrs;
+addrs = clientCompat.dht.findpeerListTimeout(peerId, timeouts.get(IPFSTimeOut.FIND_PEER_INFO), executorService);
+return addrs.stream()
+.filter(addr -> !addr.equals(""))
+.map(MultiAddress::new).collect(Collectors.toList());
+  }
+
+  public byte[] getObjectDataTimeout(Multihash object) throws IOException {
+return 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177809#comment-17177809
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470675247



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSHelper.java
##
@@ -0,0 +1,326 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multiaddr.MultiAddress;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.bouncycastle.util.Strings;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.stream.Collectors;
+
+import static org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.FETCH_DATA;
+import static org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.FIND_PEER_INFO;
+
+/**
+ * Helper class with some utilities that are specific to Drill with an IPFS storage
+ */
+public class IPFSHelper {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSHelper.class);
+
+  public static final String IPFS_NULL_OBJECT_HASH = "QmdfTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n";
+  public static final Multihash IPFS_NULL_OBJECT = Multihash.fromBase58(IPFS_NULL_OBJECT_HASH);
+
+  private ExecutorService executorService;
+  private final IPFS client;
+  private final IPFSCompat clientCompat;
+  private IPFSPeer myself;
+  private int maxPeersPerLeaf;
+  private Map<IPFSTimeOut, Integer> timeouts;
+
+  public IPFSHelper(IPFS ipfs) {
+this.client = ipfs;
+this.clientCompat = new IPFSCompat(ipfs);
+  }
+
+  public IPFSHelper(IPFS ipfs, ExecutorService executorService) {
+this(ipfs);
+this.executorService = executorService;
+  }
+
+  public void setTimeouts(Map<IPFSTimeOut, Integer> timeouts) {
+this.timeouts = timeouts;
+  }
+
+  public void setMyself(IPFSPeer myself) {
+this.myself = myself;
+  }
+
+  /**
+   * Set the maximum number of providers per leaf node. Searching for more providers makes DHT queries take longer,
+   * but makes it more likely that an optimal peer is found.
+   * @param maxPeersPerLeaf max number of providers to search per leaf node
+   */
+  public void setMaxPeersPerLeaf(int maxPeersPerLeaf) {
+this.maxPeersPerLeaf = maxPeersPerLeaf;
+  }
+
+  public IPFS getClient() {
+return client;
+  }
+
+  public IPFSCompat getClientCompat() {
+return clientCompat;
+  }
+
+  public List<Multihash> findprovsTimeout(Multihash id) {
+List<String> providers;
+providers = clientCompat.dht.findprovsListTimeout(id, maxPeersPerLeaf, timeouts.get(IPFSTimeOut.FIND_PROV), executorService);
+
+return providers.stream().map(Multihash::fromBase58).collect(Collectors.toList());
+  }
+
+  public List<MultiAddress> findpeerTimeout(Multihash peerId) {
+// trying to resolve addresses of a node itself will always hang
+// so we treat it specially
+if (peerId.equals(myself.getId())) {
+  return myself.getMultiAddresses();
+}
+
+List<String> addrs;
+addrs = clientCompat.dht.findpeerListTimeout(peerId, timeouts.get(IPFSTimeOut.FIND_PEER_INFO), executorService);
+return addrs.stream()
+.filter(addr -> !addr.equals(""))
+.map(MultiAddress::new).collect(Collectors.toList());
+  }
+
+  public byte[] getObjectDataTimeout(Multihash object) throws IOException {
+return 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177808#comment-17177808
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470675112



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,463 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSGroupScan.class);
+  private final IPFSContext ipfsContext;
+  private final IPFSScanSpec ipfsScanSpec;
+  private final IPFSStoragePluginConfig config;
+  private List<SchemaPath> columns;
+
+  private static final long DEFAULT_NODE_SIZE = 1000L;
+  private static final int DEFAULT_USER_PORT = 31010;
+  private static final int DEFAULT_CONTROL_PORT = 31011;
+  private static final int DEFAULT_DATA_PORT = 31012;
+  private static final int DEFAULT_HTTP_PORT = 8047;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private Map> endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List<SchemaPath> columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) {
+this(
+pluginRegistry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   IPFSScanSpec ipfsScanSpec,
+

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177804#comment-17177804
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470673620



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,463 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSGroupScan.class);
+  private final IPFSContext ipfsContext;
+  private final IPFSScanSpec ipfsScanSpec;
+  private final IPFSStoragePluginConfig config;
+  private List<SchemaPath> columns;
+
+  private static final long DEFAULT_NODE_SIZE = 1000L;
+  private static final int DEFAULT_USER_PORT = 31010;
+  private static final int DEFAULT_CONTROL_PORT = 31011;
+  private static final int DEFAULT_DATA_PORT = 31012;
+  private static final int DEFAULT_HTTP_PORT = 8047;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private Map> endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List<SchemaPath> columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) {
+this(
+pluginRegistry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   IPFSScanSpec ipfsScanSpec,
+

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177802#comment-17177802
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470673442



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,463 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import 
org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(IPFSGroupScan.class);
+  private final IPFSContext ipfsContext;
+  private final IPFSScanSpec ipfsScanSpec;
+  private final IPFSStoragePluginConfig config;
+  private List columns;
+
+  private static final long DEFAULT_NODE_SIZE = 1000L;
+  private static final int DEFAULT_USER_PORT = 31010;
+  private static final int DEFAULT_CONTROL_PORT = 31011;
+  private static final int DEFAULT_DATA_PORT = 31012;
+  private static final int DEFAULT_HTTP_PORT = 8047;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private Map> endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") 
IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) {
+this(
+pluginRegistry.resolve(ipfsStoragePluginConfig, 
IPFSStoragePlugin.class).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   IPFSScanSpec ipfsScanSpec,
+

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177801#comment-17177801
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470673138



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,318 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ */
+public class IPFSCompat {
+  public final String host;
+  public final int port;
+  private final String version;
+  public final String protocol;
+  public final int readTimeout;
+  public static final int DEFAULT_READ_TIMEOUT = 0;
+
+  public final DHT dht = new DHT();
+  public final Name name = new Name();
+
+  public IPFSCompat(IPFS ipfs) {
+this(ipfs.host, ipfs.port);
+  }
+
+  public IPFSCompat(String host, int port) {
+this(host, port, "/api/v0/", false, DEFAULT_READ_TIMEOUT);
+  }
+
+  public IPFSCompat(String host, int port, String version, boolean ssl, int 
readTimeout) {
+this.host = host;
+this.port = port;
+
+if(ssl) {
+  this.protocol = "https";
+} else {
+  this.protocol = "http";
+}
+
+this.version = version;
+this.readTimeout = readTimeout;
+  }
+
+  /**
+   * Resolve names to IPFS CIDs.
+   * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-resolve">resolve</a> in IPFS doc.
+   * @param scheme the scheme of the name to resolve, usually IPFS or IPNS
+   * @param path the path to the object
+   * @param recursive whether to resolve names recursively until the result is an IPFS CID
+   * @return a Map of JSON object, with the result as the value of key "Path"
+   */
+  public Map resolve(String scheme, String path, boolean recursive) {
+AtomicReference ret = new AtomicReference<>();
+getObjectStream(
+"resolve?arg=/" + scheme + "/" + path + "&recursive=" + recursive,
+res -> {
+  ret.set((Map) res);
+  return true;
+},
+err -> {
+  throw new RuntimeException(err);
+}
+);
+return ret.get();
+  }
+
+  /**
+   * As defined in 
https://github.com/libp2p/go-libp2p-core/blob/b77fd280f2bfcce22f10a000e8e1d9ec53c47049/routing/query.go#L16
+   */
+  public enum DHTQueryEventType {
+// Sending a query to a peer.
+SendingQuery,
+// Got a response from a peer.
+PeerResponse,
+// Found a "closest" peer (not currently used).
+FinalPeer,
+// Got an error when querying.
+QueryError,
+// Found a provider.
+Provider,
+// Found a value.
+Value,
+// Adding a peer to the query.
+AddingPeer,
+// Dialing a peer.
+DialingPeer;
+  }
+
+  public class DHT {
+/**
+ * Find internet addresses of a given peer.
+ * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-dht-findpeer">dht/findpeer</a> in IPFS doc.
+ * @param id the id of the peer to query
+ * @param timeout timeout value in seconds
+ * @param executor executor
+ * @return List of Multiaddresses of the peer
+ */
+public List findpeerListTimeout(Multihash id, int timeout, 
ExecutorService executor) {
+  AtomicReference> ret = new AtomicReference<>();
+  timeLimitedExec(
+  "dht/findpeer?arg=" + id,
+  

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1716#comment-1716
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674079153


   @cgivre 
   I tried to set the ports to their default values in c090a43, but it did not 
seem to do the trick. Why is that?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add storage plugin for IPFS
> ---
>
> Key: DRILL-7745
> URL: https://issues.apache.org/jira/browse/DRILL-7745
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.18.0
>Reporter: Bowen Ding
>Assignee: Bowen Ding
>Priority: Major
>
> See introduction here: [https://github.com/bdchain/Minerva]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177757#comment-17177757
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674062414


   @dbw9580 
   The unit tests are passing now on my machine.





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177736#comment-17177736
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674047192


   > @dbw9580 I believe Drill does support connections from IPv6 sockets. There 
was a recent PR for this in fact: (#1857) but I'm not sure if that is directly 
relevant.
   > Were you able to get it working?
   
   I don't see Drill binding to any IPv6 address in `ss -6ltnp`. I blocked IPv6 
addresses in 9494a30 and the tests are now passing (most of the time, due to 
https://github.com/apache/drill/pull/2084#issuecomment-674045895).
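
A minimal sketch of what filtering out IPv6 addresses can look like. This is an illustration only, not the code from the commit mentioned above: the class name `Ipv4Filter` and the method `keepIpv4` are hypothetical, and the input is assumed to be a list of IP literals (a hostname would trigger a DNS lookup).

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the plugin under review.
final class Ipv4Filter {

  // Keeps only the entries that parse as IPv4 addresses.
  // Assumes every entry is an IP literal, so no DNS lookup happens.
  static List<String> keepIpv4(List<String> addresses) {
    List<String> result = new ArrayList<>();
    for (String addr : addresses) {
      try {
        if (InetAddress.getByName(addr) instanceof Inet4Address) {
          result.add(addr);
        }
      } catch (UnknownHostException e) {
        // Not a usable address: drop it.
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> candidates =
        Arrays.asList("127.0.0.1", "0:0:0:0:0:0:0:1", "::1", "192.168.1.5");
    System.out.println(keepIpv4(candidates));
  }
}
```

Filtering on the parsed address type rather than on the string form avoids having to recognise every textual variant of an IPv6 literal.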





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177730#comment-17177730
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674045895


   The `connection rejected: /127.0.0.1:31011` failure was because sometimes 
Drill does not bind to the default ports (`31010, 31011, 31012`). It can bind 
to later ports like `31013, 31014, 31015`, hence the connection was rejected.
   
   I believe the reason Drill didn't bind to the default ports is that those 
ports were still held by the process from the previous test run and had not yet 
been released by the system. If I wait a minute or two before starting another 
round of testing, the test is likely to pass. 
   
   This is part of DRILL-7754, but I haven't come up with a plan to reliably 
store the ports info in IPFS.
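
To make the port behaviour described above concrete, here is a small sketch that checks whether a port can be bound on the loopback interface and scans upward from a base port until one is free. It is not Drill's actual binding logic; the class `PortProbe` and both method names are hypothetical, and the base port 31010 is taken from the default ports mentioned above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper, not Drill code.
final class PortProbe {

  // Returns true if a listening socket can be opened on the loopback port.
  static boolean isFree(int port) {
    try (ServerSocket socket =
        new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  // Scans basePort, basePort + 1, ... and returns the first free port,
  // or -1 if none of the candidates can be bound.
  static int firstFreePort(int basePort, int attempts) {
    for (int i = 0; i < attempts; i++) {
      int candidate = basePort + i;
      if (candidate > 65535) {
        break;
      }
      if (isFree(candidate)) {
        return candidate;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    int port = firstFreePort(31010, 10);
    System.out.println("first free port at or above 31010: " + port);
  }
}
```

A test harness could run a check like this before starting, to tell whether the default ports are still occupied by a previous run.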





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177726#comment-17177726
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-674043258


   @dbw9580 I believe Drill does support connections from IPv6 sockets.  There 
was a recent PR for this in fact: (https://github.com/apache/drill/pull/1857) 
but I'm not sure if that is directly relevant. 
   Were you able to get it working?
   





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177621#comment-17177621
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-673976246


   @cgivre Does Drill support connections from IPv6 sockets? Is it enabled by 
default or do I have to toggle some configuration items? The "protocol family 
unavailable" error could be due to lack of support for IPv6.





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177244#comment-17177244
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-673636877


   @dbw9580 
   The `ClusterTest` class is supposed to start a Drill cluster so that you can 
execute queries.  You should not need to have a Drill cluster running for the 
unit tests to complete.  
   
   I think the reason this isn't doing what you're expecting is that in the 
`initIPFS` function in `IPFSTestSuit` you are creating a plugin with a null 
configuration, and hence it isn't initializing correctly.   
   
   I stepped through `testSimple()` with the debugger and the `dataset` object 
is `null`, hence the test fails.  My suspicion is that there is one small step 
being missed here.  Could you please take a look and step through this to make 
sure that Drill is being initialized correctly?
   Thanks
   
   
   





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177232#comment-17177232
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r469944365



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,462 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import 
org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(IPFSGroupScan.class);
+  private IPFSContext ipfsContext;
+  private IPFSScanSpec ipfsScanSpec;
+  private IPFSStoragePluginConfig config;
+  private List columns;
+
+  private static long DEFAULT_NODE_SIZE = 1000l;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private Map> endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") 
IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) 
throws IOException, ExecutionSetupException {
+this(
+((IPFSStoragePlugin) 
pluginRegistry.getPlugin(ipfsStoragePluginConfig)).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   IPFSScanSpec ipfsScanSpec,
+   List columns) {
+super((String) null);
+this.ipfsContext = ipfsContext;
+this.ipfsScanSpec = ipfsScanSpec;
+   

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177213#comment-17177213
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-673621711


   If I leave an instance of Drill running and then run the unit test 
(`TestIPFSQueries`), it passes. I think the unit test does not actually 
build and run a full Drill server, which is why the connections are rejected.





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177210#comment-17177210
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-673615795


   > I'm still having issues actually getting the unit tests that require the 
IPFS daemon to actually execute.
   
   @cgivre Actually I am having trouble making that test run, too. I keep 
getting errors like "connection rejected: /127.0.0.1:31011" or "Protocol family 
unavailable: /0:0:0:0:0:0:0:1:31011". I can test successfully manually through 
the web ui with drill-embedded, though.
   
   Can you try testing through the web ui, too? The simple dataset should be 
easy to add to IPFS and test:
   
   ```bash
   ipfs object patch set-data $(ipfs object new) 
   ```
   
   This will return the hash of the simple dataset, which is 
`QmcbeavnEofA6NjG7vkpe1yLJo6En6ML4JnDooDn1BbKmR`.
   
   Then run a query through the web ui: ``select * from 
ipfs.`/ipfs/QmcbeavnEofA6NjG7vkpe1yLJo6En6ML4JnDooDn1BbKmR#json` `` .
   If the query takes too long to complete, try reducing the timeout values as 
well as the `max-peers-per-leaf` value in the plugin config.
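
For context on why the timeout values matter, here is a minimal sketch of a time-limited call using only `java.util.concurrent`. It is not the plugin's own implementation; the class `TimeLimitedCall`, the method `callWithTimeout`, and the fallback parameter are hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not the plugin's code.
final class TimeLimitedCall {

  // Runs the task on the executor and waits at most timeoutMillis for it.
  // On timeout or interruption the task is cancelled and fallback is returned.
  static <T> T callWithTimeout(Callable<T> task, long timeoutMillis,
                               ExecutorService executor, T fallback) {
    Future<T> future = executor.submit(task);
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the worker so it does not linger
      return fallback;
    } catch (InterruptedException e) {
      future.cancel(true);
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return fallback;
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    }
  }

  public static void main(String[] args) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      // Simulates a slow network query that exceeds its time budget.
      String result = callWithTimeout(() -> {
        Thread.sleep(2000);
        return "late answer";
      }, 200, executor, "timed out");
      System.out.println(result);
    } finally {
      executor.shutdownNow();
    }
  }
}
```

With this pattern every slow lookup costs at most its timeout, so a query that performs many lookups finishes sooner when the timeouts are smaller, at the price of giving up on slow peers.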





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177171#comment-17177171
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470098319



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSHelper.java
##
@@ -0,0 +1,326 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multiaddr.MultiAddress;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.bouncycastle.util.Strings;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.stream.Collectors;
+
+import static 
org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.FETCH_DATA;
+import static 
org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.FIND_PEER_INFO;
+
+/**
+ * Helper class with some utilities that are specific to Drill with an IPFS 
storage
+ */
+public class IPFSHelper {
+  private static final Logger logger = 
LoggerFactory.getLogger(IPFSHelper.class);
+
+  public static final String IPFS_NULL_OBJECT_HASH = 
"QmdfTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n";
+  public static final Multihash IPFS_NULL_OBJECT = 
Multihash.fromBase58(IPFS_NULL_OBJECT_HASH);
+
+  private ExecutorService executorService;
+  private final IPFS client;
+  private final IPFSCompat clientCompat;
+  private IPFSPeer myself;
+  private int maxPeersPerLeaf;
+  private Map timeouts;
+
+  public IPFSHelper(IPFS ipfs) {
+this.client = ipfs;
+this.clientCompat = new IPFSCompat(ipfs);
+  }
+
+  public IPFSHelper(IPFS ipfs, ExecutorService executorService) {
+this(ipfs);
+this.executorService = executorService;
+  }
+
+  public void setTimeouts(Map timeouts) {
+this.timeouts = timeouts;
+  }
+
+  public void setMyself(IPFSPeer myself) {
+this.myself = myself;
+  }
+
+  /**
+   * Set maximum number of providers per leaf node. The more providers, the 
more time it takes to do DHT queries, while
+   * it is more likely we can find an optimal peer.
+   * @param maxPeersPerLeaf max number of providers to search per leaf node
+   */
+  public void setMaxPeersPerLeaf(int maxPeersPerLeaf) {
+this.maxPeersPerLeaf = maxPeersPerLeaf;
+  }
+
+  public IPFS getClient() {
+return client;
+  }
+
+  public IPFSCompat getClientCompat() {
+return clientCompat;
+  }
+
+  public List findprovsTimeout(Multihash id) {
+List providers;
+providers = clientCompat.dht.findprovsListTimeout(id, maxPeersPerLeaf, 
timeouts.get(IPFSTimeOut.FIND_PROV), executorService);
+
+return 
providers.stream().map(Multihash::fromBase58).collect(Collectors.toList());
+  }
+
+  public List findpeerTimeout(Multihash peerId) {
+// trying to resolve addresses of a node itself will always hang
+// so we treat it specially
+if(peerId.equals(myself.getId())) {
+  return myself.getMultiAddresses();
+}
+
+List addrs;
+addrs = clientCompat.dht.findpeerListTimeout(peerId, 
timeouts.get(IPFSTimeOut.FIND_PEER_INFO), executorService);
+return addrs.stream()
+.filter(addr -> !addr.equals(""))
+.map(MultiAddress::new).collect(Collectors.toList());
+  }
+
+  public byte[] getObjectDataTimeout(Multihash object) throws IOException {
+return 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177176#comment-17177176
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470098319



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSHelper.java
##
@@ -0,0 +1,326 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multiaddr.MultiAddress;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.bouncycastle.util.Strings;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.stream.Collectors;
+
+import static 
org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.FETCH_DATA;
+import static 
org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.FIND_PEER_INFO;
+
+/**
+ * Helper class with some utilities that are specific to Drill with an IPFS storage
+ */
+public class IPFSHelper {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSHelper.class);
+
+  public static final String IPFS_NULL_OBJECT_HASH = "QmdfTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n";
+  public static final Multihash IPFS_NULL_OBJECT = Multihash.fromBase58(IPFS_NULL_OBJECT_HASH);
+
+  private ExecutorService executorService;
+  private final IPFS client;
+  private final IPFSCompat clientCompat;
+  private IPFSPeer myself;
+  private int maxPeersPerLeaf;
+  private Map timeouts;
+
+  public IPFSHelper(IPFS ipfs) {
+this.client = ipfs;
+this.clientCompat = new IPFSCompat(ipfs);
+  }
+
+  public IPFSHelper(IPFS ipfs, ExecutorService executorService) {
+this(ipfs);
+this.executorService = executorService;
+  }
+
+  public void setTimeouts(Map timeouts) {
+this.timeouts = timeouts;
+  }
+
+  public void setMyself(IPFSPeer myself) {
+this.myself = myself;
+  }
+
+  /**
+   * Set maximum number of providers per leaf node. The more providers, the more time it takes to do DHT queries, while
+   * it is more likely we can find an optimal peer.
+   * @param maxPeersPerLeaf max number of providers to search per leaf node
+   */
+  public void setMaxPeersPerLeaf(int maxPeersPerLeaf) {
+this.maxPeersPerLeaf = maxPeersPerLeaf;
+  }
+
+  public IPFS getClient() {
+return client;
+  }
+
+  public IPFSCompat getClientCompat() {
+return clientCompat;
+  }
+
+  public List<Multihash> findprovsTimeout(Multihash id) {
+List<String> providers;
+providers = clientCompat.dht.findprovsListTimeout(id, maxPeersPerLeaf, timeouts.get(IPFSTimeOut.FIND_PROV), executorService);
+
+return providers.stream().map(Multihash::fromBase58).collect(Collectors.toList());
+  }
+
+  public List<MultiAddress> findpeerTimeout(Multihash peerId) {
+// trying to resolve addresses of a node itself will always hang
+// so we treat it specially
+if(peerId.equals(myself.getId())) {
+  return myself.getMultiAddresses();
+}
+
+List<String> addrs;
+addrs = clientCompat.dht.findpeerListTimeout(peerId, timeouts.get(IPFSTimeOut.FIND_PEER_INFO), executorService);
+return addrs.stream()
+.filter(addr -> !addr.equals(""))
+.map(MultiAddress::new).collect(Collectors.toList());
+  }
+
+  public byte[] getObjectDataTimeout(Multihash object) throws IOException {
+return 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177161#comment-17177161
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470091789



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSScanSpec.java
##
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.security.InvalidParameterException;
+import java.util.Set;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+
+@JsonTypeName("IPFSScanSpec")
+public class IPFSScanSpec {
+  private static final Logger logger = 
LoggerFactory.getLogger(IPFSScanSpec.class);
+
+  public enum Prefix {
+@JsonProperty("ipfs")
+IPFS("ipfs"),
+@JsonProperty("ipns")
+IPNS("ipns");
+
+@JsonProperty("prefix")
+private final String name;
+Prefix(String prefix) {
+  this.name = prefix;
+}
+
+@Override
+public String toString() {
+  return this.name;
+}
+
+@JsonCreator
+public static Prefix of(String what) {
+  switch (what) {
+case "ipfs" :
+  return IPFS;
+case "ipns":
+  return IPNS;
+default:
+  throw new InvalidParameterException("Unsupported prefix: " + what);
+  }
+}
+  }
+
+  public enum Format {
+@JsonProperty("json")
+JSON("json"),
+@JsonProperty("csv")
+CSV("csv");
+
+@JsonProperty("format")
+private final String name;
+Format(String prefix) {
+  this.name = prefix;
+}
+
+@Override
+public String toString() {
+  return this.name;
+}
+
+@JsonCreator
+public static Format of(String what) {
+  switch (what) {
+case "json" :
+  return JSON;
+case "csv":
+  return CSV;
+default:
+  throw new InvalidParameterException("Unsupported format: " + what);
+  }
+}
+  }
+
+  public static Set<String> formats = ImmutableSet.of("json", "csv");
+  private Prefix prefix;
+  private String path;
+  private Format formatExtension;
+  private final IPFSContext ipfsContext;
+
+  @JsonCreator
+  public IPFSScanSpec (@JacksonInject StoragePluginRegistry registry,
+   @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("prefix") Prefix prefix,
+   @JsonProperty("format") Format format,
+   @JsonProperty("path") String path) {
+this.ipfsContext = registry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext();
+this.prefix = prefix;
+this.formatExtension = format;
+this.path = path;
+  }
+
+  public IPFSScanSpec (IPFSContext ipfsContext, String path) {
+this.ipfsContext = ipfsContext;
+parsePath(path);
+  }
+
+  private void parsePath(String path) {
+//FIXME: IPFS hashes are actually Base58 encoded, so "0" "O" "I" "l" are not valid
+//also CIDs can be encoded with different encodings, not necessarily Base58
+Pattern tableNamePattern = Pattern.compile("^/(ipfs|ipns)/([A-Za-z0-9]{46}(/[^#]+)*)(?:#(\\w+))?$");
+Matcher matcher = tableNamePattern.matcher(path);
+if (!matcher.matches()) {
+  throw UserException.validationError().message("Invalid IPFS path in 
query string. Use paths of pattern `/scheme/hashpath#format`, where scheme:= 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177158#comment-17177158
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470091514



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSScanSpec.java
##
@@ -0,0 +1,217 @@
+  private void parsePath(String path) {
+//FIXME: IPFS hashes are actually Base58 encoded, so "0" "O" "I" "l" are not valid

Review comment:
   Again, please either remove, or include a reference to a JIRA to 
document what needs to be done. 
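
   For reference, one possible way to address the first half of that FIXME is sketched below. It keeps the shape of the pattern in `parsePath` but restricts the hash to the Base58 alphabet, which omits "0", "O", "I" and "l". The class, constant and method names are hypothetical and not part of the PR, and the sketch does not cover the second FIXME line (CIDs in other encodings).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: class, constant and method names are hypothetical,
// not taken from the PR under review.
final class Base58PathSketch {
  // Base58 (Bitcoin alphabet): digits and letters without 0, O, I and l.
  private static final String BASE58_CHAR = "[1-9A-HJ-NP-Za-km-z]";

  // Same shape as the pattern in parsePath, with the hash restricted to
  // 46 Base58 characters. Group 1: scheme, group 2: hash plus optional
  // sub-path, group 3: optional format suffix.
  private static final Pattern TABLE_NAME = Pattern.compile(
      "^/(ipfs|ipns)/(" + BASE58_CHAR + "{46}(?:/[^#]+)*)(?:#(\\w+))?$");

  static boolean isValidPath(String path) {
    return TABLE_NAME.matcher(path).matches();
  }

  // Returns the format suffix after '#', or null if it is absent
  // or the path does not match at all.
  static String formatOf(String path) {
    Matcher m = TABLE_NAME.matcher(path);
    return m.matches() ? m.group(3) : null;
  }

  public static void main(String[] args) {
    // The null-object hash quoted in IPFSHelper is used as sample input.
    String hash = "QmdfTbBqBPQ7VNxZEYEj14VmRuZBkqFbiwReogJgS1zR1n";
    System.out.println(isValidPath("/ipfs/" + hash + "#json"));
    // Swapping the trailing 'n' for '0' leaves the Base58 alphabet.
    System.out.println(isValidPath("/ipfs/" + hash.replace('n', '0')));
    System.out.println(formatOf("/ipfs/" + hash + "/data.csv#csv"));
  }
}
```

   The group numbering differs from the original pattern because the sub-path group is non-capturing here, so any code that reads `matcher.group(...)` would need adjusting.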





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177154#comment-17177154
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470089240



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,318 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ */
+public class IPFSCompat {
+  public final String host;
+  public final int port;
+  private final String version;
+  public final String protocol;
+  public final int readTimeout;
+  public static final int DEFAULT_READ_TIMEOUT = 0;
+
+  public final DHT dht = new DHT();
+  public final Name name = new Name();
+
+  public IPFSCompat(IPFS ipfs) {
+this(ipfs.host, ipfs.port);
+  }
+
+  public IPFSCompat(String host, int port) {
+this(host, port, "/api/v0/", false, DEFAULT_READ_TIMEOUT);
+  }
+
+  public IPFSCompat(String host, int port, String version, boolean ssl, int 
readTimeout) {
+this.host = host;
+this.port = port;
+
+if(ssl) {
+  this.protocol = "https";
+} else {
+  this.protocol = "http";
+}
+
+this.version = version;
+this.readTimeout = readTimeout;
+  }
+
+  /**
+   * Resolve names to IPFS CIDs.
+   * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-resolve">resolve in IPFS doc</a>.
+   * @param scheme the scheme of the name to resolve, usually IPFS or IPNS
+   * @param path the path to the object
+   * @param recursive whether to resolve names recursively until the result is an IPFS CID
+   * @return a Map of JSON object, with the result as the value of key "Path"
+   */
+  public Map resolve(String scheme, String path, boolean recursive) {
+AtomicReference ret = new AtomicReference<>();
+getObjectStream(
+"resolve?arg=/" + scheme + "/" + path + "&recursive=" + recursive,
+res -> {
+  ret.set((Map) res);
+  return true;
+},
+err -> {
+  throw new RuntimeException(err);
+}
+);
+return ret.get();
+  }
+
+  /**
+   * As defined in https://github.com/libp2p/go-libp2p-core/blob/b77fd280f2bfcce22f10a000e8e1d9ec53c47049/routing/query.go#L16
+   */
+  public enum DHTQueryEventType {
+// Sending a query to a peer.
+SendingQuery,
+// Got a response from a peer.
+PeerResponse,
+// Found a "closest" peer (not currently used).
+FinalPeer,
+// Got an error when querying.
+QueryError,
+// Found a provider.
+Provider,
+// Found a value.
+Value,

Review comment:
   No. I included them for the sake of completeness. Should they be removed?






[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177153#comment-17177153
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470088211



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSHelper.java
##
@@ -0,0 +1,326 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177152#comment-17177152
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470087869



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSHelper.java
##
@@ -0,0 +1,326 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177150#comment-17177150
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470087561



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSHelper.java
##
@@ -0,0 +1,326 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177146#comment-17177146
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470086686



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,463 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = LoggerFactory.getLogger(IPFSGroupScan.class);
+  private final IPFSContext ipfsContext;
+  private final IPFSScanSpec ipfsScanSpec;
+  private final IPFSStoragePluginConfig config;
+  private List columns;
+
+  private static final long DEFAULT_NODE_SIZE = 1000L;
+  private static final int DEFAULT_USER_PORT = 31010;
+  private static final int DEFAULT_CONTROL_PORT = 31011;
+  private static final int DEFAULT_DATA_PORT = 31012;
+  private static final int DEFAULT_HTTP_PORT = 8047;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private Map> endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+                       @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+                       @JsonProperty("columns") List columns,
+                       @JacksonInject StoragePluginRegistry pluginRegistry) {
+    this(
+        pluginRegistry.resolve(ipfsStoragePluginConfig, IPFSStoragePlugin.class).getIPFSContext(),
+        ipfsScanSpec,
+        columns
+    );
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   IPFSScanSpec ipfsScanSpec,
+ 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177145#comment-17177145
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470085822



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,463 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177142#comment-17177142
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470084386



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,463 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177141#comment-17177141
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470083089



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,463 @@

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177140#comment-17177140
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470082144



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,318 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ */
+public class IPFSCompat {
+  public final String host;
+  public final int port;
+  private final String version;
+  public final String protocol;
+  public final int readTimeout;
+  public static final int DEFAULT_READ_TIMEOUT = 0;
+
+  public final DHT dht = new DHT();
+  public final Name name = new Name();
+
+  public IPFSCompat(IPFS ipfs) {
+this(ipfs.host, ipfs.port);
+  }
+
+  public IPFSCompat(String host, int port) {
+this(host, port, "/api/v0/", false, DEFAULT_READ_TIMEOUT);
+  }
+
+  public IPFSCompat(String host, int port, String version, boolean ssl, int readTimeout) {
+    this.host = host;
+    this.port = port;
+
+    if(ssl) {
+      this.protocol = "https";
+    } else {
+      this.protocol = "http";
+    }
+
+    this.version = version;
+    this.readTimeout = readTimeout;
+  }
+
+  /**
+   * Resolve names to IPFS CIDs.
+   * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-resolve">resolve</a> in IPFS doc.
+   * @param scheme the scheme of the name to resolve, usually IPFS or IPNS
+   * @param path the path to the object
+   * @param recursive whether to recursively resolve names until the result is an IPFS CID
+   * @return a Map of JSON object, with the result as the value of key "Path"
+   */
+  public Map resolve(String scheme, String path, boolean recursive) {
+    AtomicReference<Map> ret = new AtomicReference<>();
+    getObjectStream(
+        "resolve?arg=/" + scheme+"/"+path +"&recursive="+recursive,
+        res -> {
+          ret.set((Map) res);
+          return true;
+        },
+        err -> {
+          throw new RuntimeException(err);
+        }
+    );
+    return ret.get();
+  }
+
+  /**
+   * As defined in https://github.com/libp2p/go-libp2p-core/blob/b77fd280f2bfcce22f10a000e8e1d9ec53c47049/routing/query.go#L16
+   */
+  public enum DHTQueryEventType {
+// Sending a query to a peer.
+SendingQuery,
+// Got a response from a peer.
+PeerResponse,
+// Found a "closest" peer (not currently used).
+FinalPeer,
+// Got an error when querying.
+QueryError,
+// Found a provider.
+Provider,
+// Found a value.
+Value,

Review comment:
   Are `Value`, `AddingPeer` and `DialingPeer` ever used?  
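
   For context on the question above, here is a minimal sketch of one reason the seemingly unused constants might still matter. It assumes (this is not confirmed anywhere in the PR) that DHT query events arrive with an integer `Type` field and that the code maps that integer to the enum by declaration order, mirroring the order in the linked `query.go`. `DhtEventTypeDemo` and `fromCode` are hypothetical names used only for illustration.

```java
import java.util.Optional;

final class DhtEventTypeDemo {
  // Same constants, in the same order, as the enum quoted in the diff.
  // Assumption: this order matches the integer codes sent by the IPFS daemon.
  enum DHTQueryEventType {
    SendingQuery, PeerResponse, FinalPeer, QueryError,
    Provider, Value, AddingPeer, DialingPeer
  }

  // Hypothetical helper: look up a constant by its integer code.
  // Returns an empty Optional for codes outside the known range.
  static Optional<DHTQueryEventType> fromCode(int code) {
    DHTQueryEventType[] all = DHTQueryEventType.values();
    if (code < 0 || code >= all.length) {
      return Optional.empty();
    }
    return Optional.of(all[code]);
  }

  public static void main(String[] args) {
    System.out.println(fromCode(4)); // a provider event under the assumed ordering
    System.out.println(fromCode(42)); // an unknown code
  }
}
```

   Under that assumption, dropping `Value` from the enum would shift `AddingPeer` and `DialingPeer` to the wrong codes, so removing unused constants would only be safe if the lookup did not rely on position.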





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add storage plugin for IPFS
> ---
>
> Key: DRILL-7745
> URL: https://issues.apache.org/jira/browse/DRILL-7745
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177138#comment-17177138
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470081506



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,318 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ */
+public class IPFSCompat {
+  public final String host;
+  public final int port;
+  private final String version;
+  public final String protocol;
+  public final int readTimeout;
+  public static final int DEFAULT_READ_TIMEOUT = 0;
+
+  public final DHT dht = new DHT();
+  public final Name name = new Name();
+
+  public IPFSCompat(IPFS ipfs) {
+this(ipfs.host, ipfs.port);
+  }
+
+  public IPFSCompat(String host, int port) {
+this(host, port, "/api/v0/", false, DEFAULT_READ_TIMEOUT);
+  }
+
+  public IPFSCompat(String host, int port, String version, boolean ssl, int readTimeout) {
+    this.host = host;
+    this.port = port;
+
+    if(ssl) {
+      this.protocol = "https";
+    } else {
+      this.protocol = "http";
+    }
+
+    this.version = version;
+    this.readTimeout = readTimeout;
+  }
+
+  /**
+   * Resolve names to IPFS CIDs.
+   * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-resolve">resolve</a> in IPFS doc.
+   * @param scheme the scheme of the name to resolve, usually IPFS or IPNS
+   * @param path the path to the object
+   * @param recursive whether to recursively resolve names until the result is an IPFS CID
+   * @return a Map of JSON object, with the result as the value of key "Path"
+   */
+  public Map resolve(String scheme, String path, boolean recursive) {
+    AtomicReference<Map> ret = new AtomicReference<>();
+    getObjectStream(
+        "resolve?arg=/" + scheme+"/"+path +"&recursive="+recursive,
+        res -> {
+          ret.set((Map) res);
+          return true;
+        },
+        err -> {
+          throw new RuntimeException(err);
+        }
+    );
+    return ret.get();
+  }
+
+  /**
+   * As defined in https://github.com/libp2p/go-libp2p-core/blob/b77fd280f2bfcce22f10a000e8e1d9ec53c47049/routing/query.go#L16
+   */
+  public enum DHTQueryEventType {
+// Sending a query to a peer.
+SendingQuery,
+// Got a response from a peer.
+PeerResponse,
+// Found a "closest" peer (not currently used).
+FinalPeer,
+// Got an error when querying.
+QueryError,
+// Found a provider.
+Provider,
+// Found a value.
+Value,
+// Adding a peer to the query.
+AddingPeer,
+// Dialing a peer.
+DialingPeer;
+  }
+
+  public class DHT {
+    /**
+     * Find internet addresses of a given peer.
+     * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-dht-findpeer">dht/findpeer</a> in IPFS doc.
+     * @param id the id of the peer to query
+     * @param timeout timeout value in seconds
+     * @param executor executor
+     * @return List of Multiaddresses of the peer
+     */
+    public List<String> findpeerListTimeout(Multihash id, int timeout, ExecutorService executor) {
+      AtomicReference<List<String>> ret = new AtomicReference<>();
+      timeLimitedExec(
+          "dht/findpeer?arg=" + id,
+  

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177137#comment-17177137
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470080197



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all API calls to be made with POST method.
+ * Upstream issue tracker: https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */

Review comment:
   I saw that this PR 
(https://github.com/ipfs-shipyard/java-ipfs-http-client/pull/172) was merged!  
Once there is a release that includes it, can we update the `pom.xml` so 
that we are using the "official" library?
   
   Should this now work with all versions of IPFS?  





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add storage plugin for IPFS
> ---
>
> Key: DRILL-7745
> URL: https://issues.apache.org/jira/browse/DRILL-7745
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.18.0
>Reporter: Bowen Ding
>Assignee: Bowen Ding
>Priority: Major
>
> See introduction here: [https://github.com/bdchain/Minerva]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177105#comment-17177105
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-673552669


   @dbw9580 
   I redownloaded and it built for me.  Please disregard previous comments.





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177103#comment-17177103
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-673543986


   > `TestIPFQueries` fails the checkstyle due to unused imports.
   @cgivre Hmm I don't see any unused imports in this file and my builds are 
passing.





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177102#comment-17177102
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470035068



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,462 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import 
org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(IPFSGroupScan.class);
+  private IPFSContext ipfsContext;
+  private IPFSScanSpec ipfsScanSpec;
+  private IPFSStoragePluginConfig config;
+  private List columns;
+
+  private static long DEFAULT_NODE_SIZE = 1000l;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private Map> endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") 
IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) 
throws IOException, ExecutionSetupException {
+this(
+((IPFSStoragePlugin) 
pluginRegistry.getPlugin(ipfsStoragePluginConfig)).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   IPFSScanSpec ipfsScanSpec,
+   List columns) {
+super((String) null);
+this.ipfsContext = ipfsContext;
+this.ipfsScanSpec = ipfsScanSpec;
+   

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177091#comment-17177091
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r470025016



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all 
API calls to be made with POST method.
+ * Upstream issue tracker: 
https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */
+public class IPFSCompat {
+  public final String host;
+  public final int port;
+  private final String version;
+  public final String protocol;
+  public final int readTimeout;
+  public static final int DEFAULT_READ_TIMEOUT = 0;
+
+  public final DHT dht = new DHT();
+  public final Name name = new Name();
+
+  public IPFSCompat(IPFS ipfs) {
+this(ipfs.host, ipfs.port);
+  }
+
+  public IPFSCompat(String host, int port) {
+this(host, port, "/api/v0/", false, DEFAULT_READ_TIMEOUT);
+  }
+
+  public IPFSCompat(String host, int port, String version, boolean ssl, int 
readTimeout) {
+this.host = host;
+this.port = port;
+
+if(ssl) {
+  this.protocol = "https";
+} else {
+  this.protocol = "http";
+}
+
+this.version = version;
+this.readTimeout = readTimeout;
+  }
+
+  /**
+   * Resolve names to IPFS CIDs.
+   * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-resolve">resolve</a> in IPFS doc.
+   * @param scheme the scheme of the name to resolve, usually IPFS or IPNS
+   * @param path the path to the object
+   * @param recursive whether recursively resolve names until it is a IPFS CID
+   * @return a Map of JSON object, with the result as the value of key "Path"
+   */
+  public Map resolve(String scheme, String path, boolean recursive) {
+AtomicReference ret = new AtomicReference<>();
+getObjectStream(
+"resolve?arg=/" + scheme + "/" + path + "&recursive=" + recursive,
+res -> {
+  ret.set((Map) res);
+  return true;
+},
+err -> {
+  throw new RuntimeException(err);
+}
+);
+return ret.get();
+  }
+
+  public class DHT {
+/**
+ * Find internet addresses of a given peer.
+ * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-dht-findpeer">dht/findpeer</a> in IPFS doc.
+ * @param id the id of the peer to query
+ * @param timeout timeout value in seconds
+ * @param executor executor
+ * @return List of Multiaddresses of the peer
+ */
+public List findpeerListTimeout(Multihash id, int timeout, 
ExecutorService executor) {
+  AtomicReference> ret = new AtomicReference<>();
+  timeLimitedExec(
+  "name/resolve?arg=" + id,
+  timeout,
+  res -> {
+Map peer = (Map) res;

Review comment:
   Made some changes in 39bab37.






[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177026#comment-17177026
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r469960924



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all 
API calls to be made with POST method.
+ * Upstream issue tracker: 
https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */
+public class IPFSCompat {
+  public final String host;
+  public final int port;
+  private final String version;
+  public final String protocol;
+  public final int readTimeout;
+  public static final int DEFAULT_READ_TIMEOUT = 0;
+
+  public final DHT dht = new DHT();
+  public final Name name = new Name();
+
+  public IPFSCompat(IPFS ipfs) {
+this(ipfs.host, ipfs.port);
+  }
+
+  public IPFSCompat(String host, int port) {
+this(host, port, "/api/v0/", false, DEFAULT_READ_TIMEOUT);
+  }
+
+  public IPFSCompat(String host, int port, String version, boolean ssl, int 
readTimeout) {
+this.host = host;
+this.port = port;
+
+if(ssl) {
+  this.protocol = "https";
+} else {
+  this.protocol = "http";
+}
+
+this.version = version;
+this.readTimeout = readTimeout;
+  }
+
+  /**
+   * Resolve names to IPFS CIDs.
+   * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-resolve">resolve</a> in IPFS doc.
+   * @param scheme the scheme of the name to resolve, usually IPFS or IPNS
+   * @param path the path to the object
+   * @param recursive whether recursively resolve names until it is a IPFS CID
+   * @return a Map of JSON object, with the result as the value of key "Path"
+   */
+  public Map resolve(String scheme, String path, boolean recursive) {
+AtomicReference ret = new AtomicReference<>();
+getObjectStream(
+"resolve?arg=/" + scheme + "/" + path + "&recursive=" + recursive,
+res -> {
+  ret.set((Map) res);
+  return true;
+},
+err -> {
+  throw new RuntimeException(err);
+}
+);
+return ret.get();
+  }
+
+  public class DHT {
+/**
+ * Find internet addresses of a given peer.
+ * See <a href="https://docs.ipfs.io/reference/http/api/#api-v0-dht-findpeer">dht/findpeer</a> in IPFS doc.
+ * @param id the id of the peer to query
+ * @param timeout timeout value in seconds
+ * @param executor executor
+ * @return List of Multiaddresses of the peer
+ */
+public List findpeerListTimeout(Multihash id, int timeout, 
ExecutorService executor) {
+  AtomicReference> ret = new AtomicReference<>();
+  timeLimitedExec(
+  "name/resolve?arg=" + id,
+  timeout,
+  res -> {
+Map peer = (Map) res;

Review comment:
   I think it's unnecessary to specify all the type parameters. These 
`Map`s are JSON responses from the IPFS daemon, and can be deeply nested. It 
would be best handled by a library to properly define the types and structures 
of these responses, e.g. via DAOs, but the 
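
To illustrate the point about untyped, nested JSON responses: the following is only a sketch of how a small helper could read one value out of nested `Map`s with a type check, instead of chaining unchecked casts. The class `JsonPaths`, its `get` method, and the keys `Peer` and `Addrs` are hypothetical names chosen for this example; they are not part of this PR or of `java-ipfs-http-client`, and the map shape is not the daemon's actual response schema.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of the PR or the IPFS client library. */
final class JsonPaths {
  private JsonPaths() {}

  /**
   * Walks nested JSON objects (parsed into Maps) along the given keys and
   * returns the final value only if it exists and has the expected type.
   */
  static <T> Optional<T> get(Object json, Class<T> type, String... keys) {
    Object current = json;
    for (String key : keys) {
      if (!(current instanceof Map)) {
        // The path continues but the current node is not a JSON object.
        return Optional.empty();
      }
      current = ((Map<?, ?>) current).get(key);
    }
    if (type.isInstance(current)) {
      return Optional.of(type.cast(current));
    }
    // Missing value (null) or a value of an unexpected type.
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Illustrative shape only; assumed for the example.
    Map<String, Object> peer = new HashMap<>();
    peer.put("Addrs", Collections.singletonList("/ip4/127.0.0.1/tcp/4001"));
    Map<String, Object> reply = new HashMap<>();
    reply.put("Peer", peer);

    Optional<List> addrs = get(reply, List.class, "Peer", "Addrs");
    System.out.println(addrs.isPresent());
  }
}
```

A helper like this keeps the raw `Map` responses but moves the casts into one place, which may be a reasonable middle ground until the responses get proper typed definitions.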

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177004#comment-17177004
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r469944365



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSGroupScan.java
##
@@ -0,0 +1,462 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.util.DrillVersionInfo;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.physical.EndpointAffinity;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.schedule.AffinityCreator;
+import org.apache.drill.exec.store.schedule.AssignmentCreator;
+import org.apache.drill.exec.store.schedule.CompleteWork;
+import org.apache.drill.exec.store.schedule.EndpointByteMap;
+import org.apache.drill.exec.store.schedule.EndpointByteMapImpl;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.base.Stopwatch;
+import org.apache.drill.shaded.guava.com.google.common.cache.LoadingCache;
+import 
org.apache.drill.shaded.guava.com.google.common.collect.ArrayListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ListMultimap;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Random;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.RecursiveTask;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+
+@JsonTypeName("ipfs-scan")
+public class IPFSGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(IPFSGroupScan.class);
+  private IPFSContext ipfsContext;
+  private IPFSScanSpec ipfsScanSpec;
+  private IPFSStoragePluginConfig config;
+  private List columns;
+
+  private static long DEFAULT_NODE_SIZE = 1000l;
+
+  private ListMultimap assignments;
+  private List ipfsWorkList = Lists.newArrayList();
+  private Map> endpointWorksMap;
+  private List affinities;
+
+  @JsonCreator
+  public IPFSGroupScan(@JsonProperty("IPFSScanSpec") IPFSScanSpec ipfsScanSpec,
+   @JsonProperty("IPFSStoragePluginConfig") 
IPFSStoragePluginConfig ipfsStoragePluginConfig,
+   @JsonProperty("columns") List columns,
+   @JacksonInject StoragePluginRegistry pluginRegistry) 
throws IOException, ExecutionSetupException {
+this(
+((IPFSStoragePlugin) 
pluginRegistry.getPlugin(ipfsStoragePluginConfig)).getIPFSContext(),
+ipfsScanSpec,
+columns
+);
+  }
+
+  public IPFSGroupScan(IPFSContext ipfsContext,
+   IPFSScanSpec ipfsScanSpec,
+   List columns) {
+super((String) null);
+this.ipfsContext = ipfsContext;
+this.ipfsScanSpec = ipfsScanSpec;
+   

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176733#comment-17176733
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-673221990


   @dbw9580 
   Please verify that the project builds and passes all checkstyles.  
`TestIPFQueries` fails the checkstyle due to unused imports.





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176668#comment-17176668
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r469626333



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all 
API calls to be made with POST method.
+ * Upstream issue tracker: 
https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */

Review comment:
   Cool, let's just keep an eye on that for now. 







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175429#comment-17175429
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r468469590



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all 
API calls to be made with POST method.
+ * Upstream issue tracker: 
https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */

Review comment:
   Good news, it's trivial to revert to Java 8: 
https://github.com/ipfs-shipyard/java-ipfs-http-client/pull/172.
   Let's hope it gets merged soon.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174381#comment-17174381
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r467971039



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all API calls to be made with POST method.
+ * Upstream issue tracker: https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */

Review comment:
   > Drill can't query IPFS version > 0.4.2 due to library restrictions. We 
can't simply upgrade the library because it requires Java 11 and Drill is built 
on Java 8. Is that correct?
   
   Yes, `java-ipfs-http-client` v1.2.3, the last version that runs on Java 8, 
supports IPFS up to version 0.4.23, the last release before version 0.5, which 
introduced the incompatibility described in 
https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157. The latest 
library version, v1.3.2, supports IPFS v0.5+ but requires Java 11.
   
   > How critical would you say this is for functionality?
   
   I'm not sure how many IPFS users have upgraded to v0.5+, but users can 
[downgrade to a previous version of 
IPFS](https://github.com/ipfs/ipfs-update#revert) if they want to run Drill 
with IPFS support for the time being. Newer IPFS versions bring performance 
improvements, which could help Drill run queries faster, but the basic 
functionality should be the same.
   
   > Is there some workaround possible so that Drill will work with the latest 
IPFS version?
   
   At first glance, the `java-ipfs-http-client` library seems to use some 
Java 11 features, but only in tests. We could fork the library, revert 
the Java target version to 8, and ignore those tests. I need to investigate 
further to see whether this is really a solution.
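
The POST-only restriction mentioned in the `IPFSCompat` Javadoc can be illustrated with plain Java 8 classes. This is only a minimal sketch, assuming an IPFS daemon that exposes its HTTP API under `/api/v0/` on a given host and port; the class `IpfsPostSketch` and its methods are hypothetical and are not part of the PR or of `java-ipfs-http-client`.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical helper for illustration only; not part of the PR. */
final class IpfsPostSketch {
  private IpfsPostSketch() {}

  /** Builds the URL of an IPFS HTTP API command such as "version". */
  static URL apiUrl(String host, int port, String command) throws IOException {
    return new URL("http", host, port, "/api/v0/" + command);
  }

  /** Prepares a POST request; nothing is sent until the response is read. */
  static HttpURLConnection preparePost(URL url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    // Per the Javadoc quoted above, newer IPFS versions require POST for API calls.
    conn.setRequestMethod("POST");
    conn.setConnectTimeout(5_000); // illustrative values, in milliseconds
    conn.setReadTimeout(5_000);
    return conn;
  }
}
```

Calling `preparePost(apiUrl("127.0.0.1", 5001, "version"))` and then reading the response would issue the request against a local daemon; the address and port here are assumptions for the example.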







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174354#comment-17174354
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r467938032



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all API calls to be made with POST method.
+ * Upstream issue tracker: https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */

Review comment:
   Let me make sure I understand this:
   Drill can't query IPFS version > 0.4.2 due to library restrictions.  We 
can't simply upgrade the library because it requires Java 11 and Drill is built 
on Java 8.  Is that correct?
   
   (Sorry, I'm not an expert on IPFS, and I just want to make sure I'm 
understanding all this.)  
   How critical would you say this is for functionality?  Is there some 
workaround possible so that Drill will work with the latest IPFS version?
   
   







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174348#comment-17174348
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r467934049



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all API calls to be made with POST method.
+ * Upstream issue tracker: https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */

Review comment:
   Ah, OK. It seems like we need to update Drill to use a more recent version 
of Java.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174342#comment-17174342
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r467923605



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Consumer;
+import java.util.function.Predicate;
+
+/**
+ * Compatibility fixes for java-ipfs-http-client library
+ *
+ * Supports IPFS up to version v0.4.23, due to new restrictions enforcing all API calls to be made with POST method.
+ * Upstream issue tracker: https://github.com/ipfs-shipyard/java-ipfs-http-client/issues/157
+ */

Review comment:
   The library upgraded its target Java version to 11: 
https://github.com/ipfs-shipyard/java-ipfs-http-client/commit/6c0016c00b9a3cd213343fa25adb5099be52a401
   Drill is still using Java 8, so I'm not sure we can do this.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174054#comment-17174054
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r467667133



##
File path: 
contrib/storage-ipfs/src/test/java/org/apache/drill/exec/store/ipfs/TestIPFSGroupScan.java
##
@@ -0,0 +1,162 @@
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.api.MerkleNode;
+import io.ipfs.multiaddr.MultiAddress;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.categories.IPFSStorageTest;
+import org.apache.drill.categories.SlowTest;
+import org.apache.drill.shaded.guava.com.google.common.cache.CacheBuilder;
+import org.apache.drill.shaded.guava.com.google.common.cache.CacheLoader;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.apache.drill.shaded.guava.com.google.common.io.Resources;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.mockito.Mock;
+import org.mockito.Mockito;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+import static org.apache.drill.exec.store.ipfs.IPFSStoragePluginConfig.IPFSTimeOut.*;

Review comment:
   The star import is a Checkstyle violation.  

##
File path: 
contrib/storage-ipfs/src/test/java/org/apache/drill/exec/store/ipfs/IPFSTestSuit.java
##
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.drill.categories.IPFSStorageTest;
+import org.apache.drill.categories.SlowTest;
+import org.apache.drill.shaded.guava.com.google.common.io.Resources;
+import org.junit.BeforeClass;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.Suite;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+
+@RunWith(Suite.class)
+@Suite.SuiteClasses({TestIPFSQueries.class, TestIPFSGroupScan.class})

Review comment:
   This is missing the scan spec test.

##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,284 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.IPFS;
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-08-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172804#comment-17172804
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-670308917


   @dbw9580 
   I'll take a look over the weekend.  Thanks for the contribution!





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155095#comment-17155095
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-656479289


   > > Cleaning up the PR. I was thinking about the unit tests and it might be 
good to include unit tests using Mockito to mock up some of the various 
components. That way we can test at least some of this without the IPFS daemon. 
I can post an example if you'd like.
   > 
   > Would appreciate that.
   
   Take a look here for an example:
   
   
https://github.com/apache/drill/blob/5900cdfaae20e216d4b87795bd2efc8199e648e6/contrib/storage-elastic/src/test/java/org/apache/drill/exec/store/elasticsearch/ElasticSearchGroupScanTest.java#L42-L96
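
The idea of testing without a running IPFS daemon can also be sketched without any mocking library. The following is a rough illustration only, using a hand-written fake in place of Mockito; the interface `ProviderLookup`, the class `ProviderCounter`, and the fake are all hypothetical names that do not appear in the PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical seam: the single daemon call the code under test needs. */
interface ProviderLookup {
  List<String> findProviders(String contentHash);
}

/** Hypothetical code under test; it depends only on the interface. */
final class ProviderCounter {
  private final ProviderLookup lookup;

  ProviderCounter(ProviderLookup lookup) {
    this.lookup = lookup;
  }

  /** Number of providers to plan for: capped at maxNodes, never below 1. */
  int planWidth(String contentHash, int maxNodes) {
    int found = lookup.findProviders(contentHash).size();
    return Math.max(1, Math.min(found, maxNodes));
  }
}

/** Hand-written fake that answers from a map instead of calling a daemon. */
final class FakeProviderLookup implements ProviderLookup {
  private final Map<String, List<String>> canned = new HashMap<>();

  FakeProviderLookup with(String contentHash, String... peers) {
    canned.put(contentHash, Arrays.asList(peers));
    return this;
  }

  @Override
  public List<String> findProviders(String contentHash) {
    // Unknown hashes simply have no providers.
    return canned.getOrDefault(contentHash, Collections.<String>emptyList());
  }
}
```

With Mockito, the fake would typically be replaced by `Mockito.mock(ProviderLookup.class)` combined with `Mockito.when(...).thenReturn(...)`; the structure of the test stays the same.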
   
   
   





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154690#comment-17154690
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452321224



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSStoragePluginConfig.java
##
@@ -0,0 +1,187 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.security.InvalidParameterException;
+import java.util.Map;
+
+@JsonTypeName(IPFSStoragePluginConfig.NAME)
+public class IPFSStoragePluginConfig extends StoragePluginConfigBase{
+    private static final Logger logger = LoggerFactory.getLogger(IPFSStoragePluginConfig.class);
+
+    public static final String NAME = "ipfs";
+
+    private final String host;

Review comment:
   Fixed in 48d2058.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154689#comment-17154689
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452321021



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSStoragePluginConfig.java
##
@@ -0,0 +1,187 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.security.InvalidParameterException;
+import java.util.Map;
+
+@JsonTypeName(IPFSStoragePluginConfig.NAME)
+public class IPFSStoragePluginConfig extends StoragePluginConfigBase{
+    private static final Logger logger = LoggerFactory.getLogger(IPFSStoragePluginConfig.class);
+
+    public static final String NAME = "ipfs";
+
+    private final String host;
+    private final int port;
+
+    @JsonProperty("max-nodes-per-leaf")
+    private final int maxNodesPerLeaf;
+
+    @JsonProperty("ipfs-timeouts")
+    private final Map<IPFSTimeOut, Integer> ipfsTimeouts;
+
+    @JsonIgnore
+    private static final Map<IPFSTimeOut, Integer> ipfsTimeoutDefaults = ImmutableMap.of(
+        IPFSTimeOut.FIND_PROV, 4,
+        IPFSTimeOut.FIND_PEER_INFO, 4,
+        IPFSTimeOut.FETCH_DATA, 6
+    );
+
+    public enum IPFSTimeOut {
+        @JsonProperty("find-provider")
+        FIND_PROV("find-provider"),
+        @JsonProperty("find-peer-info")
+        FIND_PEER_INFO("find-peer-info"),
+        @JsonProperty("fetch-data")
+        FETCH_DATA("fetch-data");
+
+        @JsonProperty("type")
+        private final String which;
+        IPFSTimeOut(String which) {
+            this.which = which;
+        }
+
+        @JsonCreator
+        public static IPFSTimeOut of(String which) {
+            switch (which) {
+                case "find-provider":
+                    return FIND_PROV;
+                case "find-peer-info":
+                    return FIND_PEER_INFO;
+                case "fetch-data":
+                    return FETCH_DATA;
+                default:
+                    throw new InvalidParameterException("Unknown key for IPFS timeout config entry: " + which);
+            }
+        }
+
+        @Override
+        public String toString() {
+            return this.which;
+        }
+    }
+
+    @JsonProperty("groupscan-worker-threads")
+    private final int numWorkerThreads;
+
+    @JsonProperty
+    private final Map<String, FormatPluginConfig> formats;
+
+    @JsonCreator
+    public IPFSStoragePluginConfig(
+        @JsonProperty("host") String host,
+        @JsonProperty("port") int port,
+        @JsonProperty("max-nodes-per-leaf") int maxNodesPerLeaf,
+        @JsonProperty("ipfs-timeouts") Map<IPFSTimeOut, Integer> ipfsTimeouts,
+        @JsonProperty("groupscan-worker-threads") int numWorkerThreads,
+        @JsonProperty("formats") Map<String, FormatPluginConfig> formats) {
+        this.host = host;
+        this.port = port;
+        this.maxNodesPerLeaf = maxNodesPerLeaf > 0 ? maxNodesPerLeaf : 1;
+        if (ipfsTimeouts != null) {
+            ipfsTimeoutDefaults.forEach(ipfsTimeouts::putIfAbsent);
+        } else {
+            ipfsTimeouts = ipfsTimeoutDefaults;
+        }
+        this.ipfsTimeouts = ipfsTimeouts;
+        this.numWorkerThreads = numWorkerThreads > 0 ? numWorkerThreads : 1;
+        this.formats = formats;
+    }
+
+    public String getHost() {
+        return host;
+    }
+
+    public int getPort() {
+        return port;
+    }
+
+    public int getMaxNodesPerLeaf() {
+        return maxNodesPerLeaf;
+    }
+
+    public int getIpfsTimeout(IPFSTimeOut which) {
+        return ipfsTimeouts.get(which);
+    }
+
+public Map 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154688#comment-17154688
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452320526



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSStoragePluginConfig.java
##
@@ -0,0 +1,187 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.security.InvalidParameterException;
+import java.util.Map;
+
+@JsonTypeName(IPFSStoragePluginConfig.NAME)
+public class IPFSStoragePluginConfig extends StoragePluginConfigBase{
+    private static final Logger logger = LoggerFactory.getLogger(IPFSStoragePluginConfig.class);

Review comment:
   Fixed in 48d2058.
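
For readers following the `ipfs-timeouts` handling in the `IPFSStoragePluginConfig` constructor, the default-merging step can be sketched in isolation. This is a simplified illustration, not the PR's code: `TimeoutDefaultsSketch` and `TimeoutKind` are hypothetical stand-ins, and unlike the constructor it copies into a new map rather than calling `putIfAbsent` on the caller's map. The default values 4, 4, and 6 are taken from the diff.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical stand-alone sketch of merging user timeouts with defaults. */
final class TimeoutDefaultsSketch {
  /** Stand-in for IPFSStoragePluginConfig.IPFSTimeOut. */
  enum TimeoutKind { FIND_PROV, FIND_PEER_INFO, FETCH_DATA }

  /** Default values as they appear in the diff. */
  static final Map<TimeoutKind, Integer> DEFAULTS;
  static {
    Map<TimeoutKind, Integer> d = new EnumMap<>(TimeoutKind.class);
    d.put(TimeoutKind.FIND_PROV, 4);
    d.put(TimeoutKind.FIND_PEER_INFO, 4);
    d.put(TimeoutKind.FETCH_DATA, 6);
    DEFAULTS = Collections.unmodifiableMap(d);
  }

  private TimeoutDefaultsSketch() {}

  /** Returns a new map: defaults first, then user entries on top. */
  static Map<TimeoutKind, Integer> withDefaults(Map<TimeoutKind, Integer> user) {
    Map<TimeoutKind, Integer> merged = new EnumMap<>(TimeoutKind.class);
    merged.putAll(DEFAULTS);
    if (user != null) {
      merged.putAll(user); // user-supplied values override the defaults
    }
    return merged;
  }
}
```

Copying keeps the caller's map untouched and also works when the supplied map is immutable, which may or may not matter depending on how Jackson builds the map in practice.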







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154687#comment-17154687
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452320145



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSchemaFactory.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.Table;
+import org.apache.drill.exec.planner.logical.DynamicDrillTable;
+import org.apache.drill.exec.store.AbstractSchema;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.SchemaFactory;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.Sets;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.Set;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.ConcurrentSkipListMap;
+
+public class IPFSSchemaFactory implements SchemaFactory{
+  private static final Logger logger = LoggerFactory.getLogger(IPFSSchemaFactory.class);
+
+  final String schemaName;
+  final IPFSContext context;
+
+  public IPFSSchemaFactory(IPFSContext context, String name) throws IOException {
+this.context = context;
+this.schemaName = name;
+  }
+
+  @Override
+  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) throws IOException {
+logger.debug("registerSchemas {}", schemaName);
+IPFSTables schema = new IPFSTables(schemaName);
+SchemaPlus hPlus = parent.add(schemaName, schema);
+schema.setHolder(hPlus);
+  }
+
+  class IPFSTables extends AbstractSchema {
+private Set tableNames = Sets.newHashSet();
+private final ConcurrentMap tables = new ConcurrentSkipListMap<>(String::compareToIgnoreCase);
+public IPFSTables (String name) {
+  super(ImmutableList.of(), name);
+  tableNames.add(name);
+}
+
+public void setHolder(SchemaPlus pulsOfThis) {
+}
+
+@Override
+public String getTypeName() {
+  return IPFSStoragePluginConfig.NAME;
+}
+
+@Override
+public Set getTableNames() {
+  return Collections.emptySet();
+}
+
+@Override
+public Table getTable(String tableName) {
+  //TODO: better handling of table names

Review comment:
   This is now tracked as DRILL-7766.

##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSchemaFactory.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.Table;
+import org.apache.drill.exec.planner.logical.DynamicDrillTable;
+import org.apache.drill.exec.store.AbstractSchema;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.SchemaFactory;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.Sets;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154685#comment-17154685
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452319560



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java
##
@@ -0,0 +1,190 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.DeserializationContext;
+import com.fasterxml.jackson.databind.JsonDeserializer;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
+import com.fasterxml.jackson.databind.annotation.JsonSerialize;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+
+/*import org.apache.drill.common.expression.SchemaPath;*/
+
+@JsonTypeName("ipfs-sub-scan")
+public class IPFSSubScan extends AbstractBase implements SubScan {
+  private static int IPFS_SUB_SCAN_VALUE = 19155;
+  private final IPFSContext ipfsContext;
+  private final List ipfsSubScanSpecList;
+  private final IPFSScanSpec.Format format;
+  private final List columns;
+
+
+  @JsonCreator
+  public IPFSSubScan(@JacksonInject StoragePluginRegistry registry,
+ @JsonProperty("IPFSStoragePluginConfig") IPFSStoragePluginConfig ipfsStoragePluginConfig,
+ @JsonProperty("IPFSSubScanSpec") @JsonDeserialize(using=MultihashDeserializer.class) List ipfsSubScanSpecList,
+ @JsonProperty("format") IPFSScanSpec.Format format,
+ @JsonProperty("columns") List columns
+ ) throws ExecutionSetupException {
+super((String) null);
+IPFSStoragePlugin plugin = (IPFSStoragePlugin) registry.getPlugin(ipfsStoragePluginConfig);
+ipfsContext = plugin.getIPFSContext();
+this.ipfsSubScanSpecList = ipfsSubScanSpecList;
+this.format = format;
+this.columns = columns;
+  }
+
+  public IPFSSubScan(IPFSContext ipfsContext, List ipfsSubScanSpecList, IPFSScanSpec.Format format, List columns) {
+super((String) null);
+this.ipfsContext = ipfsContext;
+this.ipfsSubScanSpecList = ipfsSubScanSpecList;
+this.format = format;
+this.columns = columns;
+  }
+
+  @JsonIgnore
+  public IPFSContext getIPFSContext() {
+return ipfsContext;
+  }
+
+  @JsonProperty("IPFSStoragePluginConfig")
+  public IPFSStoragePluginConfig getIPFSStoragePluginConfig() {
+return ipfsContext.getStoragePluginConfig();
+  }
+
+  @JsonProperty("columns")
+  public List getColumns() {
+return columns;
+  }
+
+  @JsonProperty("format")
+  public IPFSScanSpec.Format getFormat() {
+return format;
+  }
+
+  @Override
+  public String toString() {
+return new PlanStringBuilder(this)
+.field("scan spec", ipfsSubScanSpecList)
+.field("format", format)

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154684#comment-17154684
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452319373



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java
##
@@ -0,0 +1,190 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.DeserializationContext;
+import com.fasterxml.jackson.databind.JsonDeserializer;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
+import com.fasterxml.jackson.databind.annotation.JsonSerialize;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+
+/*import org.apache.drill.common.expression.SchemaPath;*/

Review comment:
   Fixed in 0f9c2db.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154662#comment-17154662
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452297400



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java
##
@@ -0,0 +1,190 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.DeserializationContext;
+import com.fasterxml.jackson.databind.JsonDeserializer;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
+import com.fasterxml.jackson.databind.annotation.JsonSerialize;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+
+/*import org.apache.drill.common.expression.SchemaPath;*/
+
+@JsonTypeName("ipfs-sub-scan")
+public class IPFSSubScan extends AbstractBase implements SubScan {
+  private static int IPFS_SUB_SCAN_VALUE = 19155;
+  private final IPFSContext ipfsContext;
+  private final List ipfsSubScanSpecList;

Review comment:
   Yes, it doesn't really matter which `List` implementation this is, and 
the rest of the code does not rely on a specific implementation, either. I 
created a `LinkedList` instance here, which was a mistake:
   
https://github.com/apache/drill/blob/df4a7b2993e6752481d6b35d636f5fef4a20aebf/contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java#L182
   
   Should I change it to an `ArrayList`? Declaring the field with the 
interface type seems like the default approach.
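
A minimal sketch of the pattern under discussion, assuming nothing about the rest of the PR: the field is declared as the `List` interface and the concrete `ArrayList` is chosen in one place. `SubScanSketch` and its `String` element type are hypothetical stand-ins, not the actual `IPFSSubScan` code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a sub-scan holder; not the PR's actual class. */
final class SubScanSketch {
  // Declared with the interface type, so callers never depend on the implementation.
  private final List<String> specs;

  SubScanSketch(List<String> specs) {
    // The concrete type is picked here only; ArrayList gives cheap indexed access.
    this.specs = new ArrayList<>(specs);
  }

  List<String> getSpecs() {
    return specs;
  }
}
```

Swapping `ArrayList` for another implementation would then touch a single line, which is the practical argument for keeping the interface as the declared type.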







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154650#comment-17154650
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452281356



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSchemaFactory.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.Table;
+import org.apache.drill.exec.planner.logical.DynamicDrillTable;
+import org.apache.drill.exec.store.AbstractSchema;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.SchemaFactory;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.Sets;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.Set;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.ConcurrentSkipListMap;
+
+public class IPFSSchemaFactory implements SchemaFactory{
+  private static final Logger logger = LoggerFactory.getLogger(IPFSSchemaFactory.class);
+
+  final String schemaName;
+  final IPFSContext context;
+
+  public IPFSSchemaFactory(IPFSContext context, String name) throws IOException {
+this.context = context;
+this.schemaName = name;
+  }
+
+  @Override
+  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) throws IOException {
+logger.debug("registerSchemas {}", schemaName);
+IPFSTables schema = new IPFSTables(schemaName);
+SchemaPlus hPlus = parent.add(schemaName, schema);
+schema.setHolder(hPlus);
+  }
+
+  class IPFSTables extends AbstractSchema {
+private Set tableNames = Sets.newHashSet();
+private final ConcurrentMap tables = new ConcurrentSkipListMap<>(String::compareToIgnoreCase);
+public IPFSTables (String name) {
+  super(ImmutableList.of(), name);
+  tableNames.add(name);
+}
+
+public void setHolder(SchemaPlus pulsOfThis) {
+}
+
+@Override
+public String getTypeName() {
+  return IPFSStoragePluginConfig.NAME;
+}
+
+@Override
+public Set getTableNames() {
+  return Collections.emptySet();
+}
+
+@Override
+public Table getTable(String tableName) {
+  //TODO: better handling of table names

Review comment:
   In that case, perhaps create a JIRA and reference it in the code 
comments. It's fine with me to leave the code, but please add an 
explanation of why it's there and what the plans are. 







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154648#comment-17154648
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r452279329



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSchemaFactory.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.Table;
+import org.apache.drill.exec.planner.logical.DynamicDrillTable;
+import org.apache.drill.exec.store.AbstractSchema;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.SchemaFactory;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+import org.apache.drill.shaded.guava.com.google.common.collect.Sets;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.Set;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.ConcurrentSkipListMap;
+
+public class IPFSSchemaFactory implements SchemaFactory{
+  private static final Logger logger = LoggerFactory.getLogger(IPFSSchemaFactory.class);
+
+  final String schemaName;
+  final IPFSContext context;
+
+  public IPFSSchemaFactory(IPFSContext context, String name) throws IOException {
+this.context = context;
+this.schemaName = name;
+  }
+
+  @Override
+  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) throws IOException {
+logger.debug("registerSchemas {}", schemaName);
+IPFSTables schema = new IPFSTables(schemaName);
+SchemaPlus hPlus = parent.add(schemaName, schema);
+schema.setHolder(hPlus);
+  }
+
+  class IPFSTables extends AbstractSchema {
+private Set tableNames = Sets.newHashSet();
+private final ConcurrentMap tables = new ConcurrentSkipListMap<>(String::compareToIgnoreCase);
+public IPFSTables (String name) {
+  super(ImmutableList.of(), name);
+  tableNames.add(name);
+}
+
+public void setHolder(SchemaPlus pulsOfThis) {
+}
+
+@Override
+public String getTypeName() {
+  return IPFSStoragePluginConfig.NAME;
+}
+
+@Override
+public Set getTableNames() {
+  return Collections.emptySet();
+}
+
+@Override
+public Table getTable(String tableName) {
+  //TODO: better handling of table names

Review comment:
   This is actually related to writer support. The initial design was to 
use a placeholder name for a yet-to-be-created table on IPFS, e.g. ``ipfs.`create` 
``. Since the table names are hashes of the content, they cannot be known 
before the tables are created. I could delete this part of the code; it doesn't 
do anything anyway.
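
To illustrate why a content-derived name cannot be known ahead of time, here is a simplified sketch that names a "table" by the SHA-256 digest of its bytes. This is an assumption-laden stand-in: real IPFS identifiers are multihash-based CIDs computed over chunked data, so actual names look different, and `ContentAddressSketch` is a hypothetical class that is not part of the PR.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical illustration of content addressing; not IPFS's real CID scheme. */
final class ContentAddressSketch {
  /** Returns the lowercase hex SHA-256 digest of the content, used here as its name. */
  static String nameFor(String content) {
    try {
      MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
      byte[] digest = sha256.digest(content.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required on every Java platform, so this should not happen.
      throw new IllegalStateException(e);
    }
  }
}
```

Because the name is a function of the bytes, any change to the data yields a different name, which is why a writer would need some placeholder until the content exists.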







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154636#comment-17154636
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-656170430


   > Cleaning up the PR. I was thinking about the unit tests and it might be 
good to include unit tests using Mockito to mock up some of the various 
components. That way we can test at least some of this without the IPFS daemon. 
I can post an example if you'd like.
   
   Would appreciate that.
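
As a rough sketch of testing without the IPFS daemon, the daemon can be hidden behind a small interface and replaced by an in-memory stub. All names here (`BlockFetcher`, `LineCounter`, `InMemoryFetcher`) are hypothetical and not taken from the PR; with Mockito, the hand-written stub would be replaced by a mock of the same interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical seam between plugin code and the IPFS daemon. */
interface BlockFetcher {
  /** Returns the object's data, or null if the hash is unknown. */
  String fetch(String hash);
}

/** Hypothetical component under test: counts the lines of a fetched object. */
final class LineCounter {
  private final BlockFetcher fetcher;

  LineCounter(BlockFetcher fetcher) {
    this.fetcher = fetcher;
  }

  int countLines(String hash) {
    String data = fetcher.fetch(hash);
    if (data == null || data.isEmpty()) {
      return 0;
    }
    return data.split("\n").length;
  }
}

/** In-memory stub that stands in for the daemon during unit tests. */
final class InMemoryFetcher implements BlockFetcher {
  private final Map<String, String> blocks = new HashMap<>();

  InMemoryFetcher put(String hash, String data) {
    blocks.put(hash, data);
    return this;
  }

  @Override
  public String fetch(String hash) {
    return blocks.get(hash);
  }
}
```

A test then builds `new LineCounter(new InMemoryFetcher().put("QmFake", "a,b\n1,2"))` and asserts on the result, with no network access involved.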





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-07-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153717#comment-17153717
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r449034470



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java
##
@@ -0,0 +1,190 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.DeserializationContext;
+import com.fasterxml.jackson.databind.JsonDeserializer;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
+import com.fasterxml.jackson.databind.annotation.JsonSerialize;
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+
+/*import org.apache.drill.common.expression.SchemaPath;*/
+
+@JsonTypeName("ipfs-sub-scan")
+public class IPFSSubScan extends AbstractBase implements SubScan {
+  private static int IPFS_SUB_SCAN_VALUE = 19155;
+  private final IPFSContext ipfsContext;
+  private final List ipfsSubScanSpecList;

Review comment:
   Can this just be a regular `ArrayList`?  If there's a reason why you 
chose to use this, that's fine, but I've not seen it done this way before.

##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSSubScan.java
##
@@ -0,0 +1,190 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.DeserializationContext;
+import com.fasterxml.jackson.databind.JsonDeserializer;
+import 

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147405#comment-17147405
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r446672603



##
File path: 
contrib/storage-ipfs/src/test/java/org/apache/drill/exec/store/ipfs/TestIPFSQueries.java
##
@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.multihash.Multihash;
+import org.apache.drill.categories.IPFSStorageTest;
+import org.apache.drill.categories.SlowTest;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import static org.junit.Assert.fail;
+
+@Category({SlowTest.class, IPFSStorageTest.class})
+public class TestIPFSQueries extends IPFSTestBase {
+
+  @Test
+  public void testNullQuery() throws Exception {
+testBuilder()
+.sqlQuery(getSelectStar(IPFSHelper.IPFS_NULL_OBJECT))
+.unOrdered()
+.expectsNumRecords(1)

Review comment:
   Since we are running a query against the null object, the result set is 
expected to be empty. However, the test log file says it has one row, while 
the web interface clearly shows "no results". I changed this line to make the 
test pass, but I don't know what's going on here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add storage plugin for IPFS
> ---
>
> Key: DRILL-7745
> URL: https://issues.apache.org/jira/browse/DRILL-7745
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.18.0
>Reporter: Bowen Ding
>Assignee: Bowen Ding
>Priority: Major
>
> See introduction here: [https://github.com/bdchain/Minerva]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147384#comment-17147384
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-650790085


   @cgivre it turned out that what was blocking the tests was that the default 
number of providers in the test config was too large. As a result, IPFS could 
not find any other providers in time, hence the `TimeoutException`s. I wish the 
test logs had included full stack traces, which could have saved me hours of 
looking into the Drill planner internals... 





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147381#comment-17147381
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r446669892



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSCompat.java
##
@@ -0,0 +1,202 @@
+/*
+ * Copyright (c) 2018-2020 Bowen Ding, Yuedong Xu, Liang Wang
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.api.JSONParser;
+import io.ipfs.multihash.Multihash;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Consumer;
+
+/*
+ * Compatibility fixes for java-ipfs-http-client library
+ */
+public class IPFSCompat {
+  public final String host;
+  public final int port;
+  private final String version;
+  public final String protocol;
+  public final int readTimeout;
+  public static final int DEFAULT_READ_TIMEOUT = 0;
+
+  public final DHT dht = new DHT();
+  public final Name name = new Name();
+
+  public IPFSCompat(String host, int port) {
+this(host, port, "/api/v0", false, DEFAULT_READ_TIMEOUT);
+  }
+
+  public IPFSCompat(String host, int port, String version, boolean ssl, int 
readTimeout) {
+this.host = host;
+this.port = port;
+
+if(ssl) {
+  this.protocol = "https";
+} else {
+  this.protocol = "http";
+}
+
+this.version = version;
+this.readTimeout = readTimeout;
+  }
+
+  public class DHT {
+public List findpeerListTimeout(Multihash id, int timeout, 
ExecutorService executor) {
+  BlockingQueue> results = new 
LinkedBlockingQueue<>();
+  executor.submit(() -> retrieveAndParseStream("dht/findpeer?arg=" + id, 
results));
+
+  try {
+long stop = System.currentTimeMillis() + 
TimeUnit.SECONDS.toMillis(timeout);
+while(System.currentTimeMillis() < stop) {
+  Map peer = (Map) results.poll(timeout, TimeUnit.SECONDS);
+  if ( peer != null ) {
+if ( (int) peer.get("Type") == 2 ) {
+  return (List)
+  ((Map)
+  ((List) peer.get("Responses")
+  ).get(0)
+  ).get("Addrs");
+}
+//else: response contains no Addrs, so ignore it.

Review comment:
   I think they are removed in ebc0dc6.
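
   For reference, the bounded wait in the quoted `findpeerListTimeout` loop 
can be sketched in isolation. This is a minimal sketch using only 
`java.util.concurrent`, not the PR's code: `DeadlinePoll` and `firstMatch` are 
hypothetical names, and the predicate stands in for the `Type == 2` check. It 
passes the remaining time to each `poll` call, so the total wait stays within 
the timeout even when non-matching responses keep arriving.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

/** Hypothetical helper for illustration; not part of the pull request. */
final class DeadlinePoll {
  private DeadlinePoll() {}

  /**
   * Returns the first queued element accepted by {@code wanted}, or an empty
   * Optional once {@code timeoutSeconds} have elapsed in total.
   */
  static <T> Optional<T> firstMatch(BlockingQueue<T> queue,
                                    Predicate<? super T> wanted,
                                    long timeoutSeconds) {
    // One deadline for the whole call, computed from a monotonic clock.
    long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(timeoutSeconds);
    try {
      while (true) {
        long remaining = deadline - System.nanoTime();
        if (remaining <= 0) {
          return Optional.empty();
        }
        // Wait only for the time that is left, not for the full timeout again.
        T item = queue.poll(remaining, TimeUnit.NANOSECONDS);
        if (item == null) {
          return Optional.empty(); // the deadline passed while waiting
        }
        if (wanted.test(item)) {
          return Optional.of(item);
        }
        // Otherwise the element is not the one we want; keep polling.
      }
    } catch (InterruptedException e) {
      // Preserve the interrupt for the caller and give up on the wait.
      Thread.currentThread().interrupt();
      return Optional.empty();
    }
  }
}
```

   A caller would check the returned `Optional` and decide for itself whether 
an empty result is an error.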







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147380#comment-17147380
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r446669668



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSStoragePluginConfig.java
##
@@ -0,0 +1,191 @@
+/*
+ * Copyright (c) 2018-2020 Bowen Ding, Yuedong Xu, Liang Wang
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+
+import java.security.InvalidParameterException;
+import java.util.Map;
+
+@JsonTypeName(IPFSStoragePluginConfig.NAME)
+public class IPFSStoragePluginConfig extends StoragePluginConfigBase{
+static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(IPFSStoragePluginConfig.class);
+
+public static final String NAME = "ipfs";
+
+private final String host;
+private final int port;
+
+@JsonProperty("max-nodes-per-leaf")
+private final int maxNodesPerLeaf;
+
+//TODO add more specific timeout configs for different operations in IPFS,
+// e.g. provider resolution, data read, etc.
+@JsonProperty("ipfs-timeouts")
+private final Map ipfsTimeouts;
+
+@JsonIgnore
+private static final Map ipfsTimeoutDefaults = 
ImmutableMap.of(
+IPFSTimeOut.FIND_PROV, 4,
+IPFSTimeOut.FIND_PEER_INFO, 4,
+IPFSTimeOut.FETCH_DATA, 6
+);
+
+public enum IPFSTimeOut {
+@JsonProperty("find-provider")
+FIND_PROV("find-provider"),
+@JsonProperty("find-peer-info")
+FIND_PEER_INFO("find-peer-info"),
+@JsonProperty("fetch-data")
+FETCH_DATA("fetch-data");
+
+@JsonProperty("type")
+private String which;
+IPFSTimeOut(String which) {
+this.which = which;
+}
+
+@JsonCreator
+public static IPFSTimeOut of(String which) {
+switch (which) {
+case "find-provider":
+return FIND_PROV;
+case "find-peer-info":
+return FIND_PEER_INFO;
+case "fetch-data":
+return FETCH_DATA;
+default:
+throw new InvalidParameterException("Unknown key for IPFS 
timeout config entry: " + which);
+}
+}
+
+@Override
+public String toString() {
+return this.which;
+}
+}
+
+@JsonProperty("groupscan-worker-threads")
+private final int numWorkerThreads;
+
+@JsonProperty
+private final Map formats;
+
+@JsonCreator
+public IPFSStoragePluginConfig(
+@JsonProperty("host") String host,
+@JsonProperty("port") int port,
+@JsonProperty("max-nodes-per-leaf") int maxNodesPerLeaf,
+@JsonProperty("ipfs-timeouts") Map ipfsTimeouts,
+@JsonProperty("groupscan-worker-threads") int numWorkerThreads,
+@JsonProperty("formats") Map formats) {
+this.host = host;
+this.port = port;
+this.maxNodesPerLeaf = maxNodesPerLeaf > 0 ? maxNodesPerLeaf : 1;
+//TODO Jackson failed to deserialize the ipfsTimeouts map causing NPE
+if (ipfsTimeouts != null) {

Review comment:
   Hmm, it seems that this comment was made very early in development, and 
the issue it describes no longer exists. I deleted the comment in 282a89d.
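
   The default-merging step in the quoted constructor can be sketched on its 
own. This is a minimal sketch under stated assumptions, not the PR's 
implementation: `TimeoutDefaults` and `withDefaults` are hypothetical names, 
and plain `String` keys stand in for the `IPFSTimeOut` enum. It copies the 
defaults (4, 4 and 6, as in the diff) into a fresh map and overlays the 
configured values, so a null or immutable argument is handled and the caller's 
map is left untouched.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the pull request. */
final class TimeoutDefaults {
  private TimeoutDefaults() {}

  // Default values taken from the quoted diff; String keys are an assumption.
  static final Map<String, Integer> DEFAULTS;
  static {
    Map<String, Integer> m = new HashMap<>();
    m.put("find-provider", 4);
    m.put("find-peer-info", 4);
    m.put("fetch-data", 6);
    DEFAULTS = Collections.unmodifiableMap(m);
  }

  /** Returns the defaults overlaid with any configured entries. */
  static Map<String, Integer> withDefaults(Map<String, Integer> configured) {
    // Start from a private copy so the shared defaults are never modified.
    Map<String, Integer> merged = new HashMap<>(DEFAULTS);
    if (configured != null) {
      merged.putAll(configured); // configured values win over defaults
    }
    return Collections.unmodifiableMap(merged);
  }
}
```

   Returning an unmodifiable view is a design choice of this sketch; it keeps 
later code from changing the effective timeouts by accident.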






[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147379#comment-17147379
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r446669488



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSStoragePluginConfig.java
##
@@ -0,0 +1,191 @@
+/*
+ * Copyright (c) 2018-2020 Bowen Ding, Yuedong Xu, Liang Wang
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+
+import java.security.InvalidParameterException;
+import java.util.Map;
+
+@JsonTypeName(IPFSStoragePluginConfig.NAME)
+public class IPFSStoragePluginConfig extends StoragePluginConfigBase{
+static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(IPFSStoragePluginConfig.class);
+
+public static final String NAME = "ipfs";
+
+private final String host;
+private final int port;
+
+@JsonProperty("max-nodes-per-leaf")

Review comment:
   I don't know why, but removing this annotation seems to make the tests hang 
forever.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147378#comment-17147378
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r446669408



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSStoragePluginConfig.java
##
@@ -0,0 +1,191 @@
+/*
+ * Copyright (c) 2018-2020 Bowen Ding, Yuedong Xu, Liang Wang
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+
+import java.security.InvalidParameterException;
+import java.util.Map;
+
+@JsonTypeName(IPFSStoragePluginConfig.NAME)
+public class IPFSStoragePluginConfig extends StoragePluginConfigBase{
+static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(IPFSStoragePluginConfig.class);
+
+public static final String NAME = "ipfs";
+
+private final String host;
+private final int port;
+
+@JsonProperty("max-nodes-per-leaf")
+private final int maxNodesPerLeaf;
+
+//TODO add more specific timeout configs for different operations in IPFS,
+// e.g. provider resolution, data read, etc.
+@JsonProperty("ipfs-timeouts")
+private final Map ipfsTimeouts;
+
+@JsonIgnore
+private static final Map ipfsTimeoutDefaults = 
ImmutableMap.of(
+IPFSTimeOut.FIND_PROV, 4,
+IPFSTimeOut.FIND_PEER_INFO, 4,
+IPFSTimeOut.FETCH_DATA, 6
+);
+
+public enum IPFSTimeOut {
+@JsonProperty("find-provider")
+FIND_PROV("find-provider"),
+@JsonProperty("find-peer-info")
+FIND_PEER_INFO("find-peer-info"),
+@JsonProperty("fetch-data")
+FETCH_DATA("fetch-data");
+
+@JsonProperty("type")
+private String which;
+IPFSTimeOut(String which) {
+this.which = which;
+}
+
+@JsonCreator
+public static IPFSTimeOut of(String which) {
+switch (which) {
+case "find-provider":
+return FIND_PROV;
+case "find-peer-info":
+return FIND_PEER_INFO;
+case "fetch-data":
+return FETCH_DATA;
+default:
+throw new InvalidParameterException("Unknown key for IPFS 
timeout config entry: " + which);
+}
+}
+
+@Override
+public String toString() {
+return this.which;
+}
+}
+
+@JsonProperty("groupscan-worker-threads")
+private final int numWorkerThreads;
+
+@JsonProperty
+private final Map formats;
+
+@JsonCreator
+public IPFSStoragePluginConfig(
+@JsonProperty("host") String host,
+@JsonProperty("port") int port,
+@JsonProperty("max-nodes-per-leaf") int maxNodesPerLeaf,
+@JsonProperty("ipfs-timeouts") Map ipfsTimeouts,
+@JsonProperty("groupscan-worker-threads") int numWorkerThreads,
+@JsonProperty("formats") Map formats) {
+this.host = host;
+this.port = port;
+this.maxNodesPerLeaf = maxNodesPerLeaf > 0 ? maxNodesPerLeaf : 1;
+//TODO Jackson failed to deserialize the ipfsTimeouts map causing NPE
+if (ipfsTimeouts != null) {
+ipfsTimeoutDefaults.forEach(ipfsTimeouts::putIfAbsent);
+} else {
+ipfsTimeouts = ipfsTimeoutDefaults;
+}
+this.ipfsTimeouts = ipfsTimeouts;
+this.numWorkerThreads = numWorkerThreads > 0 ? numWorkerThreads : 1;
+this.formats = formats;
+}
+
+public String getHost() {
+return host;
+}
+
+public int getPort() {
+

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147375#comment-17147375
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r446667334



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSStoragePluginConfig.java
##
@@ -0,0 +1,191 @@
+/*
+ * Copyright (c) 2018-2020 Bowen Ding, Yuedong Xu, Liang Wang
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableMap;
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+
+import java.security.InvalidParameterException;
+import java.util.Map;
+
+@JsonTypeName(IPFSStoragePluginConfig.NAME)
+public class IPFSStoragePluginConfig extends StoragePluginConfigBase{
+static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(IPFSStoragePluginConfig.class);
+
+public static final String NAME = "ipfs";
+
+private final String host;
+private final int port;
+
+@JsonProperty("max-nodes-per-leaf")
+private final int maxNodesPerLeaf;
+
+//TODO add more specific timeout configs for different operations in IPFS,
+// e.g. provider resolution, data read, etc.
+@JsonProperty("ipfs-timeouts")
+private final Map ipfsTimeouts;
+
+@JsonIgnore
+private static final Map ipfsTimeoutDefaults = 
ImmutableMap.of(
+IPFSTimeOut.FIND_PROV, 4,
+IPFSTimeOut.FIND_PEER_INFO, 4,
+IPFSTimeOut.FETCH_DATA, 6
+);
+
+public enum IPFSTimeOut {
+@JsonProperty("find-provider")
+FIND_PROV("find-provider"),
+@JsonProperty("find-peer-info")
+FIND_PEER_INFO("find-peer-info"),
+@JsonProperty("fetch-data")
+FETCH_DATA("fetch-data");
+
+@JsonProperty("type")
+private String which;
+IPFSTimeOut(String which) {
+this.which = which;
+}
+
+@JsonCreator
+public static IPFSTimeOut of(String which) {
+switch (which) {
+case "find-provider":
+return FIND_PROV;
+case "find-peer-info":
+return FIND_PEER_INFO;
+case "fetch-data":
+return FETCH_DATA;
+default:
+throw new InvalidParameterException("Unknown key for IPFS 
timeout config entry: " + which);
+}
+}
+
+@Override
+public String toString() {
+return this.which;
+}
+}
+
+@JsonProperty("groupscan-worker-threads")
+private final int numWorkerThreads;
+
+@JsonProperty
+private final Map formats;
+
+@JsonCreator
+public IPFSStoragePluginConfig(
+@JsonProperty("host") String host,
+@JsonProperty("port") int port,
+@JsonProperty("max-nodes-per-leaf") int maxNodesPerLeaf,
+@JsonProperty("ipfs-timeouts") Map ipfsTimeouts,
+@JsonProperty("groupscan-worker-threads") int numWorkerThreads,
+@JsonProperty("formats") Map formats) {
+this.host = host;
+this.port = port;
+this.maxNodesPerLeaf = maxNodesPerLeaf > 0 ? maxNodesPerLeaf : 1;
+//TODO Jackson failed to deserialize the ipfsTimeouts map causing NPE
+if (ipfsTimeouts != null) {
+ipfsTimeoutDefaults.forEach(ipfsTimeouts::putIfAbsent);
+} else {
+ipfsTimeouts = ipfsTimeoutDefaults;
+}
+this.ipfsTimeouts = ipfsTimeouts;
+this.numWorkerThreads = numWorkerThreads > 0 ? numWorkerThreads : 1;
+this.formats = formats;
+}
+
+public String getHost() {
+return host;
+}
+
+public int getPort() {
+

[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147374#comment-17147374
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 commented on a change in pull request #2084:
URL: https://github.com/apache/drill/pull/2084#discussion_r446665500



##
File path: 
contrib/storage-ipfs/src/main/java/org/apache/drill/exec/store/ipfs/IPFSPeer.java
##
@@ -0,0 +1,107 @@
+/*
+ * Copyright (c) 2018-2020 Bowen Ding, Yuedong Xu, Liang Wang
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.drill.exec.store.ipfs;
+
+import io.ipfs.multiaddr.MultiAddress;
+import io.ipfs.multihash.Multihash;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Optional;
+
+public class IPFSPeer {
+  private IPFSHelper helper;
+
+  private Multihash id;
+  private List addrs;
+  private boolean isDrillReady;
+  private boolean isDrillReadyChecked = false;
+  private Optional drillbitAddress = Optional.empty();
+  private boolean drillbitAddressChecked = false;
+
+
+  public IPFSPeer(IPFSHelper helper, Multihash id) {
+this.helper = helper;
+this.id = id;
+  }
+
+  IPFSPeer(IPFSHelper helper, Multihash id, List addrs) {
+this.helper = helper;
+this.id = id;
+this.addrs = addrs;
+this.isDrillReady = helper.isDrillReady(id);
+this.isDrillReadyChecked = true;
+this.drillbitAddress = IPFSHelper.pickPeerHost(addrs);
+this.drillbitAddressChecked = true;
+  }
+
+  public boolean isDrillReady() {
+if (!isDrillReadyChecked) {
+  isDrillReady = helper.isDrillReady(id);
+  isDrillReadyChecked = true;
+}
+return isDrillReady;
+  }
+
+  public boolean hasDrillbitAddress() {

Review comment:
   Changed in 160a909.







[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147205#comment-17147205
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

cgivre commented on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-650694408


   > @cgivre I've added more tests. The tests are not passing; the error is 
something about `Error while applying rule DrillScanRule`. However, I was able 
to successfully execute the test queries through the Drill web interface. I 
don't know how to fix these tests.
   > 
   > Edit: attach log file.
   > 
[org.apache.drill.exec.store.ipfs.TestIPFSQueries.txt](https://github.com/apache/drill/files/4840854/org.apache.drill.exec.store.ipfs.TestIPFSQueries.txt)
   
   It appears that the query is not getting through the planning phase.  My 
suggestion is to take a look at this tutorial about writing storage plugins:
   https://github.com/paul-rogers/drill/wiki/Storage-Plugin
   and, specifically, to follow the debugging procedures that Paul outlines.  My 
hunch is that something is going wrong with the schema resolution.





[jira] [Commented] (DRILL-7745) Add storage plugin for IPFS

2020-06-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147017#comment-17147017
 ] 

ASF GitHub Bot commented on DRILL-7745:
---

dbw9580 edited a comment on pull request #2084:
URL: https://github.com/apache/drill/pull/2084#issuecomment-650577819


   @cgivre I've added more tests. The tests are not passing; the error is 
something about `Error while applying rule DrillScanRule`. However, I was able 
to successfully execute the test queries through the Drill web interface. I 
don't know how to fix these tests.
   
   Edit: attach log file.
   
[org.apache.drill.exec.store.ipfs.TestIPFSQueries.txt](https://github.com/apache/drill/files/4840854/org.apache.drill.exec.store.ipfs.TestIPFSQueries.txt)
   




