[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-09 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643620#comment-16643620
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

Jicaar commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-428235775
 
 
   Hi, sorry I was off the grid for a bit. Thanks @bmeriaux for picking up my slack! :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: Cassandra Connector, DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source as a Tuple. This change would allow the data to be 
> output as a custom POJO that the user has created and annotated using the 
> DataStax API. This would remove the need for very long Tuples to be created 
> by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the DataStax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I already have code for this working in my own project, but would like other 
> thoughts on the best way to implement it.
>  
> //Example of its use in main
> CassandraPojoInputFormat<CustomCassandraPojo> cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
> cassandraInputFormat.configure(null);
> cassandraInputFormat.open(null);
> DataSet<CustomCassandraPojo> outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint<CustomCassandraPojo>(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved
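
(For illustration only: the example above presumes that CustomCassandraPojo is annotated with the DataStax object-mapper API. A minimal sketch of such a class, with placeholder keyspace, table and column names that are not taken from this issue, could look like the following.)

import com.datastax.driver.mapping.annotations.Column;
import com.datastax.driver.mapping.annotations.PartitionKey;
import com.datastax.driver.mapping.annotations.Table;

// Hypothetical POJO mapped by the DataStax driver; all names are placeholders.
@Table(keyspace = "example_keyspace", name = "example_table")
public class CustomCassandraPojo {

	@PartitionKey
	@Column(name = "id")
	private String id;

	@Column(name = "counter")
	private Integer counter;

	// The DataStax mapper needs a no-arg constructor plus getters/setters.
	public CustomCassandraPojo() {}

	public String getId() { return id; }
	public void setId(String id) { this.id = id; }
	public Integer getCounter() { return counter; }
	public void setCounter(Integer counter) { this.counter = counter; }
}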



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-09 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642942#comment-16642942
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol closed pull request #6735: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
 
b/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
index e0806fe2056..cc07fb70799 100644
--- 
a/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
+++ 
b/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
@@ -17,25 +17,12 @@
 
 package org.apache.flink.batch.connectors.cassandra;
 
-import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
-import org.apache.flink.api.common.io.NonParallelInput;
-import org.apache.flink.api.common.io.RichInputFormat;
-import org.apache.flink.api.common.io.statistics.BaseStatistics;
 import org.apache.flink.api.java.tuple.Tuple;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.core.io.GenericInputSplit;
 import org.apache.flink.core.io.InputSplit;
-import org.apache.flink.core.io.InputSplitAssigner;
 import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
-import org.apache.flink.util.Preconditions;
 
-import com.datastax.driver.core.Cluster;
 import com.datastax.driver.core.ResultSet;
 import com.datastax.driver.core.Row;
-import com.datastax.driver.core.Session;
-import com.google.common.base.Strings;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 import java.io.IOException;
 
 @@ -44,38 +31,19 @@
  *
  * @param <OUT> type of Tuple
  */
-public class CassandraInputFormat<OUT extends Tuple> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
-   private static final Logger LOG = LoggerFactory.getLogger(CassandraInputFormat.class);
+public class CassandraInputFormat<OUT extends Tuple> extends CassandraInputFormatBase<OUT> {
 
-   private final String query;
-   private final ClusterBuilder builder;
-
-   private transient Cluster cluster;
-   private transient Session session;
+   private static final long serialVersionUID = 3642323148032444264L;
private transient ResultSet resultSet;
 
public CassandraInputFormat(String query, ClusterBuilder builder) {
-   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
-   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
-
-   this.query = query;
-   this.builder = builder;
-   }
-
-   @Override
-   public void configure(Configuration parameters) {
-   this.cluster = builder.getCluster();
-   }
-
-   @Override
-   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
-   return cachedStatistics;
+   super(query, builder);
}
 
/**
 * Opens a Session and executes the query.
 *
-* @param ignored
+* @param ignored because parameter is not parallelizable.
 * @throws IOException
 */
@Override
@@ -97,37 +65,4 @@ public OUT nextRecord(OUT reuse) throws IOException {
}
return reuse;
}
-
-   @Override
-   public InputSplit[] createInputSplits(int minNumSplits) throws 
IOException {
-   GenericInputSplit[] split = {new GenericInputSplit(0, 1)};
-   return split;
-   }
-
-   @Override
-   public InputSplitAssigner getInputSplitAssigner(InputSplit[] 
inputSplits) {
-   return new DefaultInputSplitAssigner(inputSplits);
-   }
-
-   /**
-* Closes all resources used.
-*/
-   @Override
-   public void close() throws IOException {
-   try {
-   if (session != null) {
-   session.close();
-   }
-   } catch (Exception e) {
-   LOG.error("Error while closing session.", e);
-   }
-
-   try {
-   if (cluster != null) {
-   cluster.close();
-   }
-   } catch (Exception e) {
-   LOG.error("Error while closing 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-09 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642940#comment-16642940
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on issue #5876: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#issuecomment-428102111
 
 
   Subsumed in #6735.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-09 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642941#comment-16642941
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol closed pull request #5876: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
 
b/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
index e0806fe2056..60e30669267 100644
--- 
a/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
+++ 
b/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
@@ -17,25 +17,12 @@
 
 package org.apache.flink.batch.connectors.cassandra;
 
-import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
-import org.apache.flink.api.common.io.NonParallelInput;
-import org.apache.flink.api.common.io.RichInputFormat;
-import org.apache.flink.api.common.io.statistics.BaseStatistics;
 import org.apache.flink.api.java.tuple.Tuple;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.core.io.GenericInputSplit;
 import org.apache.flink.core.io.InputSplit;
-import org.apache.flink.core.io.InputSplitAssigner;
 import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
-import org.apache.flink.util.Preconditions;
 
-import com.datastax.driver.core.Cluster;
 import com.datastax.driver.core.ResultSet;
 import com.datastax.driver.core.Row;
-import com.datastax.driver.core.Session;
-import com.google.common.base.Strings;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 import java.io.IOException;
 
 @@ -44,38 +31,18 @@
  *
  * @param <OUT> type of Tuple
  */
-public class CassandraInputFormat<OUT extends Tuple> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
-   private static final Logger LOG = LoggerFactory.getLogger(CassandraInputFormat.class);
+public class CassandraInputFormat<OUT extends Tuple> extends CassandraInputFormatBase<OUT> {
 
-   private final String query;
-   private final ClusterBuilder builder;
-
-   private transient Cluster cluster;
-   private transient Session session;
private transient ResultSet resultSet;
 
public CassandraInputFormat(String query, ClusterBuilder builder) {
-   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
-   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
-
-   this.query = query;
-   this.builder = builder;
-   }
-
-   @Override
-   public void configure(Configuration parameters) {
-   this.cluster = builder.getCluster();
-   }
-
-   @Override
-   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
-   return cachedStatistics;
+   super(query, builder);
}
 
/**
 * Opens a Session and executes the query.
 *
-* @param ignored
+* @param ignored because parameter is not parallelizable.
 * @throws IOException
 */
@Override
@@ -97,37 +64,4 @@ public OUT nextRecord(OUT reuse) throws IOException {
}
return reuse;
}
-
-   @Override
-   public InputSplit[] createInputSplits(int minNumSplits) throws 
IOException {
-   GenericInputSplit[] split = {new GenericInputSplit(0, 1)};
-   return split;
-   }
-
-   @Override
-   public InputSplitAssigner getInputSplitAssigner(InputSplit[] 
inputSplits) {
-   return new DefaultInputSplitAssigner(inputSplits);
-   }
-
-   /**
-* Closes all resources used.
-*/
-   @Override
-   public void close() throws IOException {
-   try {
-   if (session != null) {
-   session.close();
-   }
-   } catch (Exception e) {
-   LOG.error("Error while closing session.", e);
-   }
-
-   try {
-   if (cluster != null) {
-   cluster.close();
-   }
-   } catch (Exception e) {
-   LOG.error("Error while closing cluster.", e);
-   }
-   }
 }
diff --git 
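
(For context, a sketch rather than the PR's verbatim file: the constructor checks, configure(), getStatistics(), the single-split handling and close() deleted in the hunks above are what move into the shared base class. Reconstructed from those deleted lines, a CassandraInputFormatBase could look roughly like this.)

package org.apache.flink.batch.connectors.cassandra;

import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
import org.apache.flink.api.common.io.NonParallelInput;
import org.apache.flink.api.common.io.RichInputFormat;
import org.apache.flink.api.common.io.statistics.BaseStatistics;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.io.GenericInputSplit;
import org.apache.flink.core.io.InputSplit;
import org.apache.flink.core.io.InputSplitAssigner;
import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
import org.apache.flink.util.Preconditions;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.google.common.base.Strings;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;

/**
 * Sketch of a base class holding the query, cluster and session handling that the
 * Tuple-based and POJO-based input formats share (non-parallel, single split).
 */
public abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {

	private static final Logger LOG = LoggerFactory.getLogger(CassandraInputFormatBase.class);

	protected final String query;
	private final ClusterBuilder builder;

	// Subclasses open the session from this cluster in their open() method.
	protected transient Cluster cluster;
	protected transient Session session;

	public CassandraInputFormatBase(String query, ClusterBuilder builder) {
		Preconditions.checkArgument(!Strings.isNullOrEmpty(query), "Query cannot be null or empty");
		Preconditions.checkArgument(builder != null, "Builder cannot be null");
		this.query = query;
		this.builder = builder;
	}

	@Override
	public void configure(Configuration parameters) {
		this.cluster = builder.getCluster();
	}

	@Override
	public BaseStatistics getStatistics(BaseStatistics cachedStatistics) {
		return cachedStatistics;
	}

	@Override
	public InputSplit[] createInputSplits(int minNumSplits) {
		return new GenericInputSplit[]{new GenericInputSplit(0, 1)};
	}

	@Override
	public InputSplitAssigner getInputSplitAssigner(InputSplit[] inputSplits) {
		return new DefaultInputSplitAssigner(inputSplits);
	}

	/** Closes the session and the cluster, logging failures instead of rethrowing them. */
	@Override
	public void close() throws IOException {
		try {
			if (session != null) {
				session.close();
			}
		} catch (Exception e) {
			LOG.error("Error while closing session.", e);
		}
		try {
			if (cluster != null) {
				cluster.close();
			}
		} catch (Exception e) {
			LOG.error("Error while closing cluster.", e);
		}
	}
}

Subclasses such as CassandraInputFormat then keep only what is specific to their output type: opening the session and executing the query in open(), and mapping rows in nextRecord(), as the remaining hunks above show.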

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636986#comment-16636986
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r222314129
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -382,31 +383,31 @@ public void testCassandraCommitter() throws Exception {
@Test
public void testCassandraTupleAtLeastOnceSink() throws Exception {
CassandraTupleSink<Tuple3<String, Integer, Integer>> sink = new CassandraTupleSink<>(injectTableName(INSERT_DATA_QUERY), builder);
-
-   sink.open(new Configuration());
-
-   for (Tuple3<String, Integer, Integer> value : collection) {
-   sink.send(value);
+   try {
 
 Review comment:
   ok


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636985#comment-16636985
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r222314062
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraOutputFormatBase.java
 ##
 @@ -41,7 +41,7 @@
  * @param <OUT> Type of the elements to write.
  */
 public abstract class CassandraOutputFormatBase<OUT> extends RichOutputFormat<OUT> {
-   private static final Logger LOG = LoggerFactory.getLogger(CassandraOutputFormatBase.class);
+   private final Logger logger = LoggerFactory.getLogger(getClass());
 
 Review comment:
   ok


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635201#comment-16635201
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221881385
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink<CustomCassandraAnnotatedPojo> sink = new CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
+   sink.open(new Configuration());
+
+   List<CustomCassandraAnnotatedPojo> customCassandraAnnotatedPojos = IntStream.range(0, 20)
+   .mapToObj(x -> new CustomCassandraAnnotatedPojo(UUID.randomUUID().toString(), x, 0))
+   .collect(Collectors.toList());
+
+   customCassandraAnnotatedPojos.forEach(sink::send);
+   sink.close();
+
+   InputFormat<CustomCassandraAnnotatedPojo, InputSplit> source = new CassandraPojoInputFormat<>(SELECT_DATA_QUERY.replace(TABLE_NAME_VARIABLE, "batches"), builder, CustomCassandraAnnotatedPojo.class);
+   source.configure(new Configuration());
+   source.open(null);
+
+   List<CustomCassandraAnnotatedPojo> result = new ArrayList<>();
+   while (!source.reachedEnd()) {
+   CustomCassandraAnnotatedPojo temp = source.nextRecord(null);
+   result.add(temp);
+   }
+
+   source.close();
+   Assert.assertEquals(20, result.size());
+   
result.sort(Comparator.comparingInt(CustomCassandraAnnotatedPojo::getCounter));
+   
customCassandraAnnotatedPojos.sort(Comparator.comparingInt(CustomCassandraAnnotatedPojo::getCounter));
+
+   for (int i = 0; i < result.size(); i++) {
+   assertThat(result.get(i), 
samePropertyValuesAs(customCassandraAnnotatedPojos.get(i)));
 
 Review comment:
   It's fine to sort them, but you don't need to iterate over the lists.
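
   (A sketch of what dropping the loop could look like, assuming Hamcrest's Matchers.samePropertyValuesAs and containsInAnyOrder rather than quoting the final patch:)

   // Assumes org.hamcrest.Matcher/Matchers are imported alongside the existing static imports.
   // Build one matcher per expected POJO and assert the whole list in one call;
   // containsInAnyOrder also makes the explicit sort above unnecessary.
   List<Matcher<? super CustomCassandraAnnotatedPojo>> expectedMatchers = customCassandraAnnotatedPojos.stream()
   	.<Matcher<? super CustomCassandraAnnotatedPojo>>map(Matchers::samePropertyValuesAs)
   	.collect(Collectors.toList());
   assertThat(result, containsInAnyOrder(expectedMatchers));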


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635203#comment-16635203
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221883320
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink<CustomCassandraAnnotatedPojo> sink = new CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
 
 Review comment:
   Let's keep it for now, but update it once the OutputFormat is in.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635199#comment-16635199
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221882439
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraOutputFormatBase.java
 ##
 @@ -41,7 +41,7 @@
  * @param <OUT> Type of the elements to write.
  */
 public abstract class CassandraOutputFormatBase<OUT> extends RichOutputFormat<OUT> {
-   private static final Logger LOG = LoggerFactory.getLogger(CassandraOutputFormatBase.class);
+   private final Logger logger = LoggerFactory.getLogger(getClass());
 
 Review comment:
   Please revert changes to this class; this PR is for the InputFormat and 
shouldn't touch other classes unless absolutely necessary.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635202#comment-16635202
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221882020
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -382,31 +383,31 @@ public void testCassandraCommitter() throws Exception {
@Test
public void testCassandraTupleAtLeastOnceSink() throws Exception {
CassandraTupleSink<Tuple3<String, Integer, Integer>> sink = new CassandraTupleSink<>(injectTableName(INSERT_DATA_QUERY), builder);
-
-   sink.open(new Configuration());
-
-   for (Tuple3<String, Integer, Integer> value : collection) {
-   sink.send(value);
+   try {
 
 Review comment:
   please limit your changes to your sink, or move these into a separate commit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635200#comment-16635200
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221881036
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635192#comment-16635192
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221880693
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink<CustomCassandraAnnotatedPojo> sink = new CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
+   sink.open(new Configuration());
+
+   List<CustomCassandraAnnotatedPojo> customCassandraAnnotatedPojos = IntStream.range(0, 20)
+   .mapToObj(x -> new CustomCassandraAnnotatedPojo(UUID.randomUUID().toString(), x, 0))
+   .collect(Collectors.toList());
+
+   customCassandraAnnotatedPojos.forEach(sink::send);
+   sink.close();
+
+   InputFormat<CustomCassandraAnnotatedPojo, InputSplit> source = new CassandraPojoInputFormat<>(SELECT_DATA_QUERY.replace(TABLE_NAME_VARIABLE, "batches"), builder, CustomCassandraAnnotatedPojo.class);
 
 Review comment:
   We've had enough issues in the past with writes not showing up immediately 
to warrant such safeguards.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634591#comment-16634591
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221747948
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink<CustomCassandraAnnotatedPojo> sink = new CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
+   sink.open(new Configuration());
+
+   List<CustomCassandraAnnotatedPojo> customCassandraAnnotatedPojos = IntStream.range(0, 20)
+   .mapToObj(x -> new CustomCassandraAnnotatedPojo(UUID.randomUUID().toString(), x, 0))
+   .collect(Collectors.toList());
+
+   customCassandraAnnotatedPojos.forEach(sink::send);
+   sink.close();
+
+   InputFormat<CustomCassandraAnnotatedPojo, InputSplit> source = new CassandraPojoInputFormat<>(SELECT_DATA_QUERY.replace(TABLE_NAME_VARIABLE, "batches"), builder, CustomCassandraAnnotatedPojo.class);
+   source.configure(new Configuration());
+   source.open(null);
+
+   List<CustomCassandraAnnotatedPojo> result = new ArrayList<>();
+   while (!source.reachedEnd()) {
+   CustomCassandraAnnotatedPojo temp = source.nextRecord(null);
+   result.add(temp);
+   }
+
+   source.close();
+   Assert.assertEquals(20, result.size());
+   
result.sort(Comparator.comparingInt(CustomCassandraAnnotatedPojo::getCounter));
+   
customCassandraAnnotatedPojos.sort(Comparator.comparingInt(CustomCassandraAnnotatedPojo::getCounter));
+
+   for (int i = 0; i < result.size(); i++) {
+   assertThat(result.get(i), 
samePropertyValuesAs(customCassandraAnnotatedPojos.get(i)));
 
 Review comment:
   It's a reflex for me to explicitly sort collections before assertions; I think it is 
more explicit and easier to understand, but I can remove it if it is not the 
standard here :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634529#comment-16634529
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221731262
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink<CustomCassandraAnnotatedPojo> sink = new CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
+   sink.open(new Configuration());
+
+   List<CustomCassandraAnnotatedPojo> customCassandraAnnotatedPojos = IntStream.range(0, 20)
+   .mapToObj(x -> new CustomCassandraAnnotatedPojo(UUID.randomUUID().toString(), x, 0))
+   .collect(Collectors.toList());
+
+   customCassandraAnnotatedPojos.forEach(sink::send);
+   sink.close();
+
+   InputFormat<CustomCassandraAnnotatedPojo, InputSplit> source = new CassandraPojoInputFormat<>(SELECT_DATA_QUERY.replace(TABLE_NAME_VARIABLE, "batches"), builder, CustomCassandraAnnotatedPojo.class);
 
 Review comment:
   ok, it may be a little paranoid, no? :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634517#comment-16634517
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221727577
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink<CustomCassandraAnnotatedPojo> sink = new CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
 
 Review comment:
   I have code for a CassandraPojoOutputFormat in a coming PR, which I agree would be 
more appropriate, but I need this one to be merged first. Otherwise, should I 
revert to tuples for the insertion?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634510#comment-16634510
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221726367
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
 
 Review comment:
   This is necessary because I used the sink with a POJO, and the table used is 
the one defined by the annotation.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633912#comment-16633912
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat 
to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-425884580
 
 
   Hi @bmeriaux, we can merge it after @zentol's comments are resolved. We 
usually merge new features or functionality into new releases unless there is a 
good reason to port them to bug-fix releases. I suggest updating to 1.7 in the 
future to use the new Cassandra connectors. There should be good compatibility 
between 1.6 and 1.7. We can discuss it if there is any problem with not having 
the Cassandra updates in 1.6.




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633889#comment-16633889
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221578715
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink sink = new 
CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
+   sink.open(new Configuration());
+
+   List 
customCassandraAnnotatedPojos = IntStream.range(0, 20)
+   .mapToObj(x -> new 
CustomCassandraAnnotatedPojo(UUID.randomUUID().toString(), x, 0))
+   .collect(Collectors.toList());
+
+   customCassandraAnnotatedPojos.forEach(sink::send);
+   sink.close();
+
+   InputFormat source = 
new CassandraPojoInputFormat<>(SELECT_DATA_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"), builder, CustomCassandraAnnotatedPojo.class);
 
 Review comment:
   let's add a sanity check that the table actually contains 20 records to 
guard against errors during the writing.
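   A minimal sketch of such a sanity check, reusing the test's existing `session` and query constants (an illustration, not the final code):

    ResultSet rs = session.execute(SELECT_DATA_QUERY.replace(TABLE_NAME_VARIABLE, "batches"));
    // Fail fast if the writing phase did not actually persist all 20 records.
    Assert.assertEquals(20, rs.all().size());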




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633884#comment-16633884
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221577237
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink sink = new 
CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
+   sink.open(new Configuration());
+
+   List 
customCassandraAnnotatedPojos = IntStream.range(0, 20)
+   .mapToObj(x -> new 
CustomCassandraAnnotatedPojo(UUID.randomUUID().toString(), x, 0))
+   .collect(Collectors.toList());
+
+   customCassandraAnnotatedPojos.forEach(sink::send);
+   sink.close();
 
 Review comment:
   this should be done in a `finally` block
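   What that would look like in the quoted hunk, roughly (same names as above; a sketch only):

    CassandraPojoSink<CustomCassandraAnnotatedPojo> sink = new CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
    try {
        sink.open(new Configuration());
        customCassandraAnnotatedPojos.forEach(sink::send);
    } finally {
        // Ensure the sink is always closed, even if open() or send() throws.
        sink.close();
    }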




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633882#comment-16633882
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221574858
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Base class for {@link RichInputFormat}s that read data from Apache Cassandra and generate a custom Cassandra-annotated object.
+ * @param <OUT> type of inputClass
+ */
+abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+	private static final Logger LOG = LoggerFactory.getLogger(CassandraInputFormatBase.class);
 
 Review comment:
   Let's make this non-static and initialize the logger with 
`LoggerFactory.getLogger(getClass())`; then sub-classes don't need to create 
their own logger, we can have a single source for logging messages, and the 
class is actually the user-facing one (which is what one would expect).
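   A sketch of what that suggestion amounts to for the class quoted above (the generic bounds are assumed from the surrounding diff):

    import org.apache.flink.api.common.io.NonParallelInput;
    import org.apache.flink.api.common.io.RichInputFormat;
    import org.apache.flink.core.io.InputSplit;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
        // Non-static and resolved via getClass(): sub-classes reuse this field and messages are logged under the concrete, user-facing class name.
        protected final Logger logger = LoggerFactory.getLogger(getClass());
    }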



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633888#comment-16633888
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221578896
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink sink = new 
CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
+   sink.open(new Configuration());
+
+   List 
customCassandraAnnotatedPojos = IntStream.range(0, 20)
+   .mapToObj(x -> new 
CustomCassandraAnnotatedPojo(UUID.randomUUID().toString(), x, 0))
+   .collect(Collectors.toList());
+
+   customCassandraAnnotatedPojos.forEach(sink::send);
+   sink.close();
+
+   InputFormat source = 
new CassandraPojoInputFormat<>(SELECT_DATA_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"), builder, CustomCassandraAnnotatedPojo.class);
+   source.configure(new Configuration());
+   source.open(null);
+
+   List result = new ArrayList<>();
+   while (!source.reachedEnd()) {
+   CustomCassandraAnnotatedPojo temp = 
source.nextRecord(null);
+   result.add(temp);
+   }
+
+   source.close();
+   Assert.assertEquals(20, result.size());
+   
result.sort(Comparator.comparingInt(CustomCassandraAnnotatedPojo::getCounter));
+   
customCassandraAnnotatedPojos.sort(Comparator.comparingInt(CustomCassandraAnnotatedPojo::getCounter));
+
+   for (int i = 0; i < result.size(); i++) {
+   assertThat(result.get(i), 
samePropertyValuesAs(customCassandraAnnotatedPojos.get(i)));
 
 Review comment:
   since they are sorted already I believe we could just compare the lists 
directly.
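   The direct comparison would then read roughly as below; note it assumes `CustomCassandraAnnotatedPojo` implements `equals()` (or that a Hamcrest list matcher is used instead), which is why this is only a sketch:

    result.sort(Comparator.comparingInt(CustomCassandraAnnotatedPojo::getCounter));
    customCassandraAnnotatedPojos.sort(Comparator.comparingInt(CustomCassandraAnnotatedPojo::getCounter));
    Assert.assertEquals(customCassandraAnnotatedPojos, result);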



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633886#comment-16633886
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221578491
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink sink = new 
CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
 
 Review comment:
   I'm a bit torn on this one. I kinda dislike that we're using other sinks for 
the verification; this test could now fail if the `CassandraPojoSink` is 
modified in some incompatible way, which simply shouldn't be possible.
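   One way around that coupling would be to seed the table with plain CQL instead of going through the sink; a rough sketch, assuming the test's `session` and the `flink.batches` schema used elsewhere in the PR:

    for (int x = 0; x < 20; x++) {
        // Write the test rows directly, so the input-format test no longer depends on CassandraPojoSink.
        session.execute(String.format(
            "INSERT INTO flink.batches (id, counter, batch_id) VALUES ('%s', %d, 0);",
            UUID.randomUUID().toString(), x));
    }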




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633891#comment-16633891
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221574858
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Base class for {@link RichInputFormat}s that read data from Apache Cassandra and generate a custom Cassandra-annotated object.
+ * @param <OUT> type of inputClass
+ */
+abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+	private static final Logger LOG = LoggerFactory.getLogger(CassandraInputFormatBase.class);
 
 Review comment:
   Let's make this non-static and initialize the logger with 
`LoggerFactory.getLogger(getClass())`; then sub-classes don't need to create 
their own logger, we can have a single source for logging messages, and the 
logged class is actually the user-facing one (which is what one would expect).



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633883#comment-16633883
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221575749
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.ConsistencyLevel;
+import com.datastax.driver.mapping.Mapper;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing how to use the {@link CassandraPojoInputFormat}/{@link CassandraTupleOutputFormat} in the Batch API.
+ *
+ * The example assumes that a table exists in a local cassandra database, according to the following queries:
+ * CREATE KEYSPACE IF NOT EXISTS flink WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
+ * CREATE TABLE IF NOT EXISTS flink.batches (id text, counter int, batch_id int, PRIMARY KEY(id, counter, batch_id));
+ */
+public class BatchPojoExample {
+	private static final String INSERT_QUERY = "INSERT INTO flink.batches (id, counter, batch_id) VALUES (?,?,?);";
+	private static final String SELECT_QUERY = "SELECT id, counter, batch_id FROM flink.batches;";
+
+	public static void main(String[] args) throws Exception {
+
+		ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+		env.setParallelism(1);
+
+		ArrayList<Tuple3<String, Integer, Integer>> collection = new ArrayList<>(20);
+		for (int i = 0; i < 20; i++) {
+			collection.add(new Tuple3<>("string " + i, i, i));
+		}
+
+		DataSet<Tuple3<String, Integer, Integer>> dataSet = env.fromCollection(collection);
+
+		ClusterBuilder clusterBuilder = new ClusterBuilder() {
+			private static final long serialVersionUID = -1754532803757154795L;
+
+			@Override
+			protected Cluster buildCluster(Cluster.Builder builder) {
+				return builder.addContactPoints("127.0.0.1").build();
+			}
+		};
+
+		dataSet.output(new CassandraTupleOutputFormat<>(INSERT_QUERY, clusterBuilder));
+
+		env.execute("Write");
+
+		/*
+		 * This is for the purpose of showing an example of creating a DataSet using CassandraPojoInputFormat.
+		 */
+		DataSet<CustomCassandraAnnotatedPojo> inputDS = env
+			.createInput(new CassandraPojoInputFormat<>(SELECT_QUERY, clusterBuilder, CustomCassandraAnnotatedPojo.class, () -> new Mapper.Option[]{Mapper.Option.consistencyLevel(ConsistencyLevel.ANY)}));
 
 Review comment:
   place arguments on separate lines for readability
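   Applied to the quoted call, the suggestion would read roughly like this:

    DataSet<CustomCassandraAnnotatedPojo> inputDS = env.createInput(
        new CassandraPojoInputFormat<>(
            SELECT_QUERY,
            clusterBuilder,
            CustomCassandraAnnotatedPojo.class,
            () -> new Mapper.Option[]{Mapper.Option.consistencyLevel(ConsistencyLevel.ANY)}));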



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633890#comment-16633890
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221575216
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/CustomCassandraAnnotatedPojo.java
 ##
 @@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import com.datastax.driver.mapping.annotations.Column;
+import com.datastax.driver.mapping.annotations.Table;
+
+/**
+ * Example of Cassandra Annotated POJO class for use with {@link 
CassandraPojoInputFormat}.
+ */
+@Table(name = "batches", keyspace = "flink")
+public class CustomCassandraAnnotatedPojo {
+
+
 
 Review comment:
   double empty-lines should be prohibited by checkstyle, don't know why this 
isn't failing :/




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633887#comment-16633887
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221576843
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
 
 Review comment:
   This shouldn't be necessary; the test infrastructure already creates a table.




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633885#comment-16633885
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

zentol commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r221577251
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +477,41 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   session.execute(CREATE_TABLE_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"));
+
+   CassandraPojoSink sink = new 
CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, builder);
+   sink.open(new Configuration());
+
+   List 
customCassandraAnnotatedPojos = IntStream.range(0, 20)
+   .mapToObj(x -> new 
CustomCassandraAnnotatedPojo(UUID.randomUUID().toString(), x, 0))
+   .collect(Collectors.toList());
+
+   customCassandraAnnotatedPojos.forEach(sink::send);
+   sink.close();
+
+   InputFormat source = 
new CassandraPojoInputFormat<>(SELECT_DATA_QUERY.replace(TABLE_NAME_VARIABLE, 
"batches"), builder, CustomCassandraAnnotatedPojo.class);
+   source.configure(new Configuration());
+   source.open(null);
+
+   List result = new ArrayList<>();
+   while (!source.reachedEnd()) {
+   CustomCassandraAnnotatedPojo temp = 
source.nextRecord(null);
+   result.add(temp);
+   }
+
+   source.close();
 
 Review comment:
   this should be done in a `finally` block
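   Sketched against the quoted hunk, the read side would then look something like this (names reused from the test; a sketch only):

    List<CustomCassandraAnnotatedPojo> result = new ArrayList<>();
    try {
        source.configure(new Configuration());
        source.open(null);
        while (!source.reachedEnd()) {
            result.add(source.nextRecord(null));
        }
    } finally {
        // Always release the Cassandra session/cluster, even if reading fails midway.
        source.close();
    }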




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633041#comment-16633041
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-425653056
 
 
   Hi @azagrebin,
   Any idea when this will be merged? In 1.6.2, or will it wait for 1.7.0?




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628841#comment-16628841
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-424733346
 
 
   I have created an issue for `CassandraPojoOutputFormat` and have the code 
for it; it will wait for the merge of this one :)




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628838#comment-16628838
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-424732776
 
 
   OK, thanks, I just repushed to change the author email field of my commit :)




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628516#comment-16628516
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux edited a comment on issue #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-424661360
 
 
   OK, no problem. It will be a little more complex than just extending 
`CassandraOutputFormatBase` to do `CassandraPojoOutputFormat`, but I could do 
it later in another PR.
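   For a sense of why it is more than a trivial extension: the existing output formats are built around bound CQL statements, while a POJO variant needs the Datastax object mapper. A very rough, hypothetical skeleton (class name and structure are assumptions here, not the eventual PR):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.mapping.Mapper;
    import com.datastax.driver.mapping.MappingManager;

    import org.apache.flink.api.common.io.RichOutputFormat;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;

    public class CassandraPojoOutputFormat<OUT> extends RichOutputFormat<OUT> {
        private final ClusterBuilder builder;
        private final Class<OUT> outputClass;

        private transient Cluster cluster;
        private transient Session session;
        private transient Mapper<OUT> mapper;

        public CassandraPojoOutputFormat(ClusterBuilder builder, Class<OUT> outputClass) {
            this.builder = builder;
            this.outputClass = outputClass;
        }

        @Override
        public void configure(Configuration parameters) {
        }

        @Override
        public void open(int taskNumber, int numTasks) {
            cluster = builder.getCluster();
            session = cluster.connect();
            // The mapper resolves keyspace/table from the POJO's @Table annotation.
            mapper = new MappingManager(session).mapper(outputClass);
        }

        @Override
        public void writeRecord(OUT record) {
            mapper.save(record);
        }

        @Override
        public void close() {
            if (session != null) {
                session.close();
            }
            if (cluster != null) {
                cluster.close();
            }
        }
    }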




[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628515#comment-16628515
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-424661360
 
 
   ok, no problem. It will be a little more complex than just extending 
`CassandraOutputFormatBase` to build a `CassandraPojoOutputFormat`, but I could do 
it later in another PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628502#comment-16628502
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat 
to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-424658491
 
 
   Great, looks good! I think the committer will do it, but rebasing on master is 
always a good thing.
   
   The `CassandraPojoSink` is actually for the streaming API. Unless we want to 
extend `CassandraOutputFormatBase` with `CassandraPojoOutputFormat`, I think it 
is better to use `CassandraTupleOutputFormat` for the batch example in this PR, as 
it was. 
   
   Sorry for the confusion; can we roll it back to `CassandraTupleOutputFormat` for 
the batch example before merge (with `execute("write")` after `output`)? 
`CassandraOutputFormatBase` can be extended with `CassandraPojoOutputFormat` in 
another PR if needed.
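
A sketch of the suggested write-then-read flow for the batch example, assuming the `test.batches` table used in the example, a `clusterBuilder` like the one defined there, the usual Flink batch imports, and that `CustomCassandraAnnotatedPojo` is mapped (via `@Table`/`@Column`) to the same table:

```java
// Sketch only: write tuples with CassandraTupleOutputFormat, run the write job,
// then read the rows back as annotated POJOs with the new input format.
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

DataSet<Tuple2<Integer, String>> tuples = env.fromElements(
	new Tuple2<>(1, "string 1"),
	new Tuple2<>(2, "string 2"));

tuples.output(new CassandraTupleOutputFormat<>(
	"INSERT INTO test.batches (number, strings) VALUES (?,?);", clusterBuilder));
env.execute("write");

// Assumes the POJO's mapping matches the table written above.
DataSet<CustomCassandraAnnotatedPojo> pojos = env.createInput(
	new CassandraPojoInputFormat<>(
		"SELECT * FROM test.batches;", clusterBuilder, CustomCassandraAnnotatedPojo.class),
	TypeInformation.of(new TypeHint<CustomCassandraAnnotatedPojo>() {}));
pojos.print();
```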


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628416#comment-16628416
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-424631434
 
 
   I think it is ready now :) Should I rebase on master?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627249#comment-16627249
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r220167397
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public abstract class CassandraInputFormatBase<OUT> extends 
RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormatBase.class);
+
+   protected final String query;
+   protected final ClusterBuilder builder;
+
+   protected transient Cluster cluster;
+   protected transient Session session;
+
+   public CassandraInputFormatBase(String query, ClusterBuilder builder){
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
 
 Review comment:
   I think we actually consistently use `checkNotNull` mostly in Flink, also 
java util packages throw `NullPointerException` for unsupported `null` args.
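
The difference under discussion is only the exception type; a small illustrative sketch of the two precondition styles (not code from the PR):

```java
// Illustrative only: the two styles being debated in this thread.
import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
import org.apache.flink.util.Preconditions;

class PreconditionStyles {
	static void viaCheckArgument(ClusterBuilder builder) {
		// Throws IllegalArgumentException when builder is null.
		Preconditions.checkArgument(builder != null, "Builder cannot be null");
	}

	static void viaCheckNotNull(ClusterBuilder builder) {
		// Throws NullPointerException when builder is null, in line with java.util.Objects.requireNonNull.
		Preconditions.checkNotNull(builder, "Builder cannot be null");
	}
}
```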


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627241#comment-16627241
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r220167397
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public abstract class CassandraInputFormatBase<OUT> extends 
RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormatBase.class);
+
+   protected final String query;
+   protected final ClusterBuilder builder;
+
+   protected transient Cluster cluster;
+   protected transient Session session;
+
+   public CassandraInputFormatBase(String query, ClusterBuilder builder){
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
 
 Review comment:
   Well, I think we consistently use `checkNotNull` mostly in Flink, also java 
util packages throw `NullPointerException` for unsupported `null` args.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627242#comment-16627242
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r220165515
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
 ##
 @@ -44,90 +31,38 @@
  *
  * @param <OUT> type of Tuple
  */
-public class CassandraInputFormat<OUT extends Tuple> extends 
RichInputFormat<OUT, InputSplit> implements NonParallelInput {
-   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormat.class);
+public class CassandraInputFormat<OUT extends Tuple> extends 
CassandraInputFormatBase<OUT> {
 
-   private final String query;
-   private final ClusterBuilder builder;
-
-   private transient Cluster cluster;
-   private transient Session session;
+   private static final long serialVersionUID = 3642323148032444264L;
private transient ResultSet resultSet;
 
public CassandraInputFormat(String query, ClusterBuilder builder) {
-   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
-   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
-
-   this.query = query;
-   this.builder = builder;
-   }
-
-   @Override
-   public void configure(Configuration parameters) {
-   this.cluster = builder.getCluster();
-   }
-
-   @Override
-   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
-   return cachedStatistics;
+   super(query, builder);
}
 
/**
 * Opens a Session and executes the query.
 *
-* @param ignored
+* @param ignored because parameter is not parallelizable.
 
 Review comment:
   the exception can also be removed from the doc string


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627240#comment-16627240
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r220164741
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Base class for {@link RichInputFormat} read data from Apache Cassandra and 
generate a custom Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormatBase.class);
+   private static final long serialVersionUID = -1519372881115104601L;
+
+   final String query;
+   final ClusterBuilder builder;
 
 Review comment:
   can be private


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627239#comment-16627239
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r220164898
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Base class for {@link RichInputFormat} read data from Apache Cassandra and 
generate a custom Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormatBase.class);
+   private static final long serialVersionUID = -1519372881115104601L;
+
+   final String query;
+   final ClusterBuilder builder;
+
+   transient Cluster cluster;
+   transient Session session;
+
+   public CassandraInputFormatBase(String query, ClusterBuilder builder){
 
 Review comment:
   can be also package-private


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626967#comment-16626967
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r220093326
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +472,34 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   OutputFormat<Tuple3<String, Integer, Integer>> sink = new 
CassandraTupleOutputFormat<>(injectTableName(INSERT_DATA_QUERY), builder);
+   sink.configure(new Configuration());
+   sink.open(0, 1);
+
+   for (Tuple3<String, Integer, Integer> value : collection) {
+   sink.writeRecord(value);
+   }
+
+   sink.close();
+
+   InputFormat<CustomCassandraAnnotatedPojo, InputSplit> source = 
new CassandraPojoInputFormat<>(injectTableName(SELECT_DATA_QUERY), builder, 
CustomCassandraAnnotatedPojo.class);
+   source.configure(new Configuration());
+   source.open(null);
+
+   List<CustomCassandraAnnotatedPojo> result = new ArrayList<>();
+
+   while (!source.reachedEnd()) {
+   CustomCassandraAnnotatedPojo temp = 
source.nextRecord(new CustomCassandraAnnotatedPojo());
+   result.add(temp);
+   }
+
+   source.close();
+   Assert.assertEquals(20, result.size());
 
 Review comment:
   maybe the example would also be more consistent if we used `CassandraPojoSink` 
there as well to populate the table, instead of raw tuples.
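
A minimal sketch of populating the table through the streaming `CassandraPojoSink` instead of raw tuples, assuming `CustomCassandraAnnotatedPojo` exposes a constructor matching its mapped columns and that `clusterBuilder` points at the test cluster:

```java
// Sketch only: the POJO constructor arguments and the job name are assumptions.
StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.getExecutionEnvironment();

List<CustomCassandraAnnotatedPojo> pojos = new ArrayList<>();
for (int i = 0; i < 20; i++) {
	pojos.add(new CustomCassandraAnnotatedPojo("id-" + i, i, 0));
}

streamEnv
	.fromCollection(pojos)
	.addSink(new CassandraPojoSink<>(CustomCassandraAnnotatedPojo.class, clusterBuilder));
streamEnv.execute("populate table via CassandraPojoSink");
```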


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626964#comment-16626964
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r220092435
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.ConsistencyLevel;
+import com.datastax.driver.mapping.Mapper;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing the to use the {@link 
CassandraPojoInputFormat}-/CassandraOutputFormats in the Batch API.
+ *
+ * The example assumes that a table exists in a local cassandra database, 
according to the following queries:
+ * CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 
'SimpleStrategy', 'replication_factor': ‘1’};
+ * CREATE TABLE IF NOT EXISTS test.batches (number int, strings text, PRIMARY 
KEY(number, strings));
+ */
+public class BatchPojoExample {
+   private static final String INSERT_QUERY = "INSERT INTO test.batches 
(number, strings) VALUES (?,?);";
+   private static final String SELECT_QUERY = "SELECT number, strings FROM 
test.batches;";
+
+   /*
+*  table script: "CREATE TABLE test.batches (number int, strings 
text, PRIMARY KEY(number, strings));"
+*/
+   public static void main(String[] args) throws Exception {
+
+   ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+   env.setParallelism(1);
+
+   ArrayList<Tuple2<Integer, String>> collection = new 
ArrayList<>(20);
+   for (int i = 0; i < 20; i++) {
+   collection.add(new Tuple2<>(i, "string " + i));
+   }
+
+   DataSet<Tuple2<Integer, String>> dataSet = 
env.fromCollection(collection);
+
+   dataSet.output(new CassandraTupleOutputFormat<Tuple2<Integer, String>>(INSERT_QUERY, new ClusterBuilder() {
+   @Override
+   protected Cluster buildCluster(Cluster.Builder builder) 
{
+   return 
builder.addContactPoints("127.0.0.1").build();
+   }
+   }));
+
 
 Review comment:
   I think it should first create the output and then execute:
   ```
   dataSet.output(new CassandraTupleOutputFormat<>(INSERT_QUERY, 
clusterBuilder));
   env.execute("Write");
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626449#comment-16626449
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219984964
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +472,34 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   OutputFormat<Tuple3<String, Integer, Integer>> sink = new 
CassandraTupleOutputFormat<>(injectTableName(INSERT_DATA_QUERY), builder);
+   sink.configure(new Configuration());
+   sink.open(0, 1);
+
+   for (Tuple3<String, Integer, Integer> value : collection) {
+   sink.writeRecord(value);
+   }
+
+   sink.close();
+
+   InputFormat<CustomCassandraAnnotatedPojo, InputSplit> source = 
new CassandraPojoInputFormat<>(injectTableName(SELECT_DATA_QUERY), builder, 
CustomCassandraAnnotatedPojo.class);
+   source.configure(new Configuration());
+   source.open(null);
+
+   List<CustomCassandraAnnotatedPojo> result = new ArrayList<>();
+
+   while (!source.reachedEnd()) {
+   CustomCassandraAnnotatedPojo temp = 
source.nextRecord(new CustomCassandraAnnotatedPojo());
+   result.add(temp);
+   }
+
+   source.close();
+   Assert.assertEquals(20, result.size());
 
 Review comment:
   To easily save a list of CustomCassandraAnnotatedPojo, I used the 
CassandraPojoSink.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626435#comment-16626435
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219980389
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.streaming.connectors.cassandra.MapperOptions;
+import org.apache.flink.util.Preconditions;
+
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ *
+ * @param <OUT> type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends 
CassandraInputFormatBase<OUT> {
+
+   private static final long serialVersionUID = 1992091320180905115L;
+   private final MapperOptions mapperOptions;
+
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, 
Class<OUT> inputClass) {
+   this(query, builder, inputClass, null);
+   }
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, 
Class<OUT> inputClass, MapperOptions mapperOptions) {
+   super(query, builder);
+   this.mapperOptions = mapperOptions;
+   Preconditions.checkArgument(inputClass != null, "InputClass 
cannot be null");
 
 Review comment:
   Like the previous one, I think we should stick with checkArgument to keep an 
IllegalArgumentException.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626308#comment-16626308
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219957391
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public abstract class CassandraInputFormatBase<OUT> extends 
RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormatBase.class);
+
+   protected final String query;
+   protected final ClusterBuilder builder;
+
+   protected transient Cluster cluster;
+   protected transient Session session;
+
+   public CassandraInputFormatBase(String query, ClusterBuilder builder){
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
 
 Review comment:
   It will throw a NullPointerException instead of an IllegalArgumentException; 
like the other classes of the connector, I think we should keep checkArgument.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626240#comment-16626240
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219941080
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.ConsistencyLevel;
+import com.datastax.driver.mapping.Mapper;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing the to use the {@link 
CassandraPojoInputFormat}-/CassandraOutputFormats in the Batch API.
 
 Review comment:
   does `-/CassandraOutputFormat` mean something?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626239#comment-16626239
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219940842
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorITCase.java
 ##
 @@ -470,6 +472,34 @@ public void testCassandraTableSink() throws Exception {
Assert.assertTrue("The input data was not completely written to 
Cassandra", input.isEmpty());
}
 
+   @Test
+   public void testCassandraBatchPojoFormat() throws Exception {
+
+   OutputFormat<Tuple3<String, Integer, Integer>> sink = new 
CassandraTupleOutputFormat<>(injectTableName(INSERT_DATA_QUERY), builder);
+   sink.configure(new Configuration());
+   sink.open(0, 1);
+
+   for (Tuple3<String, Integer, Integer> value : collection) {
+   sink.writeRecord(value);
+   }
+
+   sink.close();
+
+   InputFormat<CustomCassandraAnnotatedPojo, InputSplit> source = 
new CassandraPojoInputFormat<>(injectTableName(SELECT_DATA_QUERY), builder, 
CustomCassandraAnnotatedPojo.class);
+   source.configure(new Configuration());
+   source.open(null);
+
+   List<CustomCassandraAnnotatedPojo> result = new ArrayList<>();
+
+   while (!source.reachedEnd()) {
+   CustomCassandraAnnotatedPojo temp = 
source.nextRecord(new CustomCassandraAnnotatedPojo());
+   result.add(temp);
+   }
+
+   source.close();
+   Assert.assertEquals(20, result.size());
 
 Review comment:
   Can we also convert the `collection` of tuples into a list of 
`CustomCassandraAnnotatedPojo` and check the contents as well, in addition to the 
size?
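   A hedged sketch of such a check (the `CustomCassandraAnnotatedPojo` constructor and getters used here are assumptions, not necessarily the ones in the PR):
   ```
   List<CustomCassandraAnnotatedPojo> expected = collection.stream()
   	.map(t -> new CustomCassandraAnnotatedPojo(t.f0, t.f1, t.f2))
   	.collect(java.util.stream.Collectors.toList());
   expected.sort(java.util.Comparator.comparing(CustomCassandraAnnotatedPojo::getId));
   result.sort(java.util.Comparator.comparing(CustomCassandraAnnotatedPojo::getId));
   Assert.assertEquals(expected.size(), result.size());
   for (int i = 0; i < expected.size(); i++) {
   	Assert.assertEquals(expected.get(i).getId(), result.get(i).getId());
   	Assert.assertEquals(expected.get(i).getCounter(), result.get(i).getCounter());
   }
   ```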


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626244#comment-16626244
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219932299
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
 
 Review comment:
   Comment:
   Base class for {@link RichInputFormat} to read data from Apache Cassandra.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626237#comment-16626237
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219943237
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.ConsistencyLevel;
+import com.datastax.driver.mapping.Mapper;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing the to use the {@link 
CassandraPojoInputFormat}-/CassandraOutputFormats in the Batch API.
+ *
+ * The example assumes that a table exists in a local cassandra database, 
according to the following queries:
+ * CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 
'SimpleStrategy', 'replication_factor': ‘1’};
+ * CREATE TABLE IF NOT EXISTS test.batches (number int, strings text, PRIMARY 
KEY(number, strings));
+ */
+public class BatchPojoExample {
+   private static final String INSERT_QUERY = "INSERT INTO test.batches 
(number, strings) VALUES (?,?);";
+   private static final String SELECT_QUERY = "SELECT number, strings FROM 
test.batches;";
+
+   /*
+*  table script: "CREATE TABLE test.batches (number int, strings 
text, PRIMARY KEY(number, strings));"
+*/
+   public static void main(String[] args) throws Exception {
+
+   ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+   env.setParallelism(1);
+
+   ArrayList<Tuple2<Integer, String>> collection = new ArrayList<>(20);
+   for (int i = 0; i < 20; i++) {
+   collection.add(new Tuple2<>(i, "string " + i));
+   }
+
+   DataSet<Tuple2<Integer, String>> dataSet = env.fromCollection(collection);
+
+   dataSet.output(new CassandraTupleOutputFormat<Tuple2<Integer, String>>(INSERT_QUERY, new ClusterBuilder() {
 
 Review comment:
   we can remove `Tuple2` and use empty `<>` (two places)
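   I.e. something like the following sketch (the anonymous `ClusterBuilder` kept as in the example):
   ```
   dataSet.output(new CassandraTupleOutputFormat<>(INSERT_QUERY, new ClusterBuilder() {
   	@Override
   	protected Cluster buildCluster(Cluster.Builder builder) {
   		return builder.addContactPoints("127.0.0.1").build();
   	}
   }));
   ```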


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626236#comment-16626236
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219935777
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/CustomCassandraAnnotatedPojo.java
 ##
 @@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import com.datastax.driver.mapping.annotations.Column;
+import com.datastax.driver.mapping.annotations.Table;
+
+/**
+ * Example of Cassandra Annotated POJO class for use with 
CassandraInputFormatter.
 
 Review comment:
   In doc comment:
   `CassandraInputFormatter` -> `{@link CassandraPojoInputFormat}`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626241#comment-16626241
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219942318
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.ConsistencyLevel;
+import com.datastax.driver.mapping.Mapper;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing the to use the {@link 
CassandraPojoInputFormat}-/CassandraOutputFormats in the Batch API.
+ *
+ * The example assumes that a table exists in a local cassandra database, 
according to the following queries:
+ * CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 
'SimpleStrategy', 'replication_factor': ‘1’};
 
 Review comment:
   I think there is still some mismatch between the queries, the doc comments, the tuples 
in `dataSet`, and the fields of `CustomCassandraAnnotatedPojo`. 
`testCassandraBatchPojoFormat` looks more consistent in that respect.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626245#comment-16626245
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219934055
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.streaming.connectors.cassandra.MapperOptions;
+import org.apache.flink.util.Preconditions;
+
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ *
+ * @param  type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends CassandraInputFormatBase<OUT> {
+
+   private static final long serialVersionUID = 1992091320180905115L;
+   private final MapperOptions mapperOptions;
+
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
 
 Review comment:
   can be `final`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626232#comment-16626232
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219929097
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormat.java
 ##
 @@ -44,38 +31,18 @@
  *
  * @param  type of Tuple
  */
-public class CassandraInputFormat<OUT extends Tuple> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
-   private static final Logger LOG = LoggerFactory.getLogger(CassandraInputFormat.class);
+public class CassandraInputFormat<OUT extends Tuple> extends CassandraInputFormatBase<OUT> {
 
 Review comment:
   please add `serialVersionUID` to `CassandraInputFormat` and 
`CassandraInputFormatBase`
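   For example (the value below is just a placeholder; any fixed long works):
   ```
   private static final long serialVersionUID = 1L;
   ```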


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626234#comment-16626234
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219929908
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param  type of inputClass
+ */
+public abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormatBase.class);
+
+   protected final String query;
+   protected final ClusterBuilder builder;
+
+   protected transient Cluster cluster;
+   protected transient Session session;
+
+   public CassandraInputFormatBase(String query, ClusterBuilder builder){
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
 
 Review comment:
   I think we can use `checkNotNull`:
   ```
   Preconditions.checkNotNull(builder, "Builder cannot be null");
   ```
   for readability
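   A sketch of the constructor with that change (keeping checkArgument for the query, since it also rejects empty strings):
   ```
   public CassandraInputFormatBase(String query, ClusterBuilder builder) {
   	Preconditions.checkArgument(!Strings.isNullOrEmpty(query), "Query cannot be null or empty");
   	this.query = query;
   	this.builder = Preconditions.checkNotNull(builder, "Builder cannot be null");
   }
   ```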


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626238#comment-16626238
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219930899
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param  type of inputClass
+ */
+public abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormatBase.class);
+
+   protected final String query;
+   protected final ClusterBuilder builder;
+
+   protected transient Cluster cluster;
+   protected transient Session session;
+
+   public CassandraInputFormatBase(String query, ClusterBuilder builder){
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
+
+   this.query = query;
+   this.builder = builder;
+   }
+
+   @Override
+   public void configure(Configuration parameters) {
+   this.cluster = builder.getCluster();
+   }
+
+   @Override
+   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
 
 Review comment:
   We can remove exception declarations from all methods (also in other touched 
classes) where they are not actually thrown.
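   For example (assuming the body simply returns the cached statistics, as in the existing tuple input format):
   ```
   @Override
   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) {
   	return cachedStatistics;
   }
   ```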


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626235#comment-16626235
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219942450
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.ConsistencyLevel;
+import com.datastax.driver.mapping.Mapper;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing the to use the {@link 
CassandraPojoInputFormat}-/CassandraOutputFormats in the Batch API.
+ *
+ * The example assumes that a table exists in a local cassandra database, 
according to the following queries:
+ * CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 
'SimpleStrategy', 'replication_factor': ‘1’};
+ * CREATE TABLE IF NOT EXISTS test.batches (number int, strings text, PRIMARY 
KEY(number, strings));
+ */
+public class BatchPojoExample {
+   private static final String INSERT_QUERY = "INSERT INTO test.batches 
(number, strings) VALUES (?,?);";
+   private static final String SELECT_QUERY = "SELECT number, strings FROM 
test.batches;";
+
+   /*
+*  table script: "CREATE TABLE test.batches (number int, strings 
text, PRIMARY KEY(number, strings));"
+*/
+   public static void main(String[] args) throws Exception {
+
+   ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+   env.setParallelism(1);
+
+   ArrayList<Tuple2<Integer, String>> collection = new ArrayList<>(20);
+   for (int i = 0; i < 20; i++) {
+   collection.add(new Tuple2<>(i, "string " + i));
+   }
+
+   DataSet<Tuple2<Integer, String>> dataSet = env.fromCollection(collection);
+
+   dataSet.output(new CassandraTupleOutputFormat<Tuple2<Integer, String>>(INSERT_QUERY, new ClusterBuilder() {
+   @Override
+   protected Cluster buildCluster(Cluster.Builder builder) 
{
+   return 
builder.addContactPoints("127.0.0.1").build();
+   }
+   }));
+
 
 Review comment:
   I think `env.execute("Write");` should be here, not at the end.
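   I.e. roughly (a sketch only; `clusterBuilder` stands for the ClusterBuilder defined in the example):
   ```
   dataSet.output(new CassandraTupleOutputFormat<>(INSERT_QUERY, clusterBuilder));
   env.execute("Write");

   DataSet<CustomCassandraAnnotatedPojo> inputDS = env.createInput(
   	new CassandraPojoInputFormat<>(SELECT_QUERY, clusterBuilder, CustomCassandraAnnotatedPojo.class));
   inputDS.print();
   ```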


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
> 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626233#comment-16626233
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219934197
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.streaming.connectors.cassandra.MapperOptions;
+import org.apache.flink.util.Preconditions;
+
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ *
+ * @param  type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends CassandraInputFormatBase<OUT> {
+
+   private static final long serialVersionUID = 1992091320180905115L;
+   private final MapperOptions mapperOptions;
+
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, 
Class<OUT> inputClass) {
+   this(query, builder, inputClass, null);
+   }
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, 
Class<OUT> inputClass, MapperOptions mapperOptions) {
+   super(query, builder);
+   this.mapperOptions = mapperOptions;
+   Preconditions.checkArgument(inputClass != null, "InputClass 
cannot be null");
 
 Review comment:
   `checkNotNull`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626246#comment-16626246
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219943693
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.ConsistencyLevel;
+import com.datastax.driver.mapping.Mapper;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing the to use the {@link 
CassandraPojoInputFormat}-/CassandraOutputFormats in the Batch API.
+ *
+ * The example assumes that a table exists in a local cassandra database, 
according to the following queries:
+ * CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 
'SimpleStrategy', 'replication_factor': ‘1’};
+ * CREATE TABLE IF NOT EXISTS test.batches (number int, strings text, PRIMARY 
KEY(number, strings));
+ */
+public class BatchPojoExample {
+   private static final String INSERT_QUERY = "INSERT INTO test.batches 
(number, strings) VALUES (?,?);";
+   private static final String SELECT_QUERY = "SELECT number, strings FROM 
test.batches;";
+
+   /*
+*  table script: "CREATE TABLE test.batches (number int, strings 
text, PRIMARY KEY(number, strings));"
+*/
+   public static void main(String[] args) throws Exception {
+
+   ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+   env.setParallelism(1);
+
+   ArrayList<Tuple2<Integer, String>> collection = new ArrayList<>(20);
+   for (int i = 0; i < 20; i++) {
+   collection.add(new Tuple2<>(i, "string " + i));
+   }
+
+   DataSet<Tuple2<Integer, String>> dataSet = env.fromCollection(collection);
+
+   dataSet.output(new CassandraTupleOutputFormat<Tuple2<Integer, String>>(INSERT_QUERY, new ClusterBuilder() {
+   @Override
+   protected Cluster buildCluster(Cluster.Builder builder) 
{
+   return 
builder.addContactPoints("127.0.0.1").build();
+   }
+   }));
+
+   /*
+*  This is for the purpose of showing an example of 
creating a DataSet using CassandraPojoInputFormat.
+*/
+   DataSet<CustomCassandraAnnotatedPojo> inputDS = env
+   .createInput(new CassandraPojoInputFormat<>(SELECT_QUERY, new ClusterBuilder() {
 
 Review comment:
   `ClusterBuilder` should also have a `serialVersionUID`. It is also duplicated in 
two places; we can create a variable to define it once.
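   E.g. a sketch of defining it once and reusing it:
   ```
   ClusterBuilder clusterBuilder = new ClusterBuilder() {
   	private static final long serialVersionUID = 1L;

   	@Override
   	protected Cluster buildCluster(Cluster.Builder builder) {
   		return builder.addContactPoints("127.0.0.1").build();
   	}
   };

   dataSet.output(new CassandraTupleOutputFormat<>(INSERT_QUERY, clusterBuilder));
   DataSet<CustomCassandraAnnotatedPojo> inputDS = env.createInput(
   	new CassandraPojoInputFormat<>(SELECT_QUERY, clusterBuilder, CustomCassandraAnnotatedPojo.class));
   ```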


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626242#comment-16626242
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219931705
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormatBase.class);
+
+   protected final String query;
+   protected final ClusterBuilder builder;
+
+   protected transient Cluster cluster;
+   protected transient Session session;
+
+   public CassandraInputFormatBase(String query, ClusterBuilder builder){
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
+
+   this.query = query;
+   this.builder = builder;
+   }
+
+   @Override
+   public void configure(Configuration parameters) {
+   this.cluster = builder.getCluster();
+   }
+
+   @Override
+   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
+   return cachedStatistics;
+   }
+
+   @Override
+   public InputSplit[] createInputSplits(int minNumSplits) throws 
IOException {
+   GenericInputSplit[] split = {new GenericInputSplit(0, 1)};
+   return split;
 
 Review comment:
   `split` local variable is redundant, we can just return `{new 
GenericInputSplit(0, 1)}`
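
   For illustration, the simplified method could read as follows (a sketch, not the merged code; note the array type has to be spelled out when returning an array literal in Java):

   @Override
   public InputSplit[] createInputSplits(int minNumSplits) {
       // a single non-parallel split, as before
       return new GenericInputSplit[]{new GenericInputSplit(0, 1)};
   }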


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626243#comment-16626243
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#discussion_r219930598
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraInputFormatBase.java
 ##
 @@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
 
 Review comment:
   I think the whole class, constructor and fields can be package private
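
   For illustration, a sketch of what package-private visibility would look like here (assuming the concrete input formats stay in the same package):

   abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {

       final String query;
       final ClusterBuilder builder;

       transient Cluster cluster;
       transient Session session;

       CassandraInputFormatBase(String query, ClusterBuilder builder) {
           // precondition checks unchanged from the PR
           this.query = query;
           this.builder = builder;
       }
       // ...
   }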


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624778#comment-16624778
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on issue #6735: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735#issuecomment-423762632
 
 
   continuation of #5876 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624776#comment-16624776
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux opened a new pull request #6735: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/6735
 
 
   What is the purpose of the change
   
   Committing the new CassandraPojoInputFormat class. This works similarly to the CassandraInputFormat class, but allows the data that is read from Cassandra to be output as a custom POJO that the user has created, which has been annotated using the Datastax API.
   Brief change log
   
   -Initial commit of the CassandraPojoInputFormat class and validation test.
   Verifying this change
   
   -CassandraPojoInputFormat can be validated with the 
testCassandraBatchPojoFormat test in the CassandraConnectorITCaseTest.java file.
   Does this pull request potentially affect one of the following parts:
   
   Dependencies (does it add or upgrade a dependency): no
   The public API, i.e., is any changed class annotated with 
@Public(Evolving): no
   The serializers: no
   The runtime per-record code paths (performance sensitive): don't know
   Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
   The S3 file system connector: no
   
   Documentation
   
   Does this pull request introduce a new feature? yes
   If yes, how is the feature documented? JavaDocs
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624774#comment-16624774
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux edited a comment on issue #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#issuecomment-423759068
 
 
   hi, 
   yes, I have time to finish the PR but i cannot push on @Jicaar  branch, what 
can I do ?
   it is on my fork: https://github.com/bmeriaux/flink/tree/FLINK-9126


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624756#comment-16624756
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on issue #5876: [FLINK-9126] New CassandraPojoInputFormat 
to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#issuecomment-423759953
 
 
   hi, true, you have to clone it, commit on top and open another PR


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624752#comment-16624752
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on issue #5876: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#issuecomment-423759068
 
 
   hi, 
   yes, I have time to finish the PR but i cannot push on @Jicaar  branch, what 
can I do ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622002#comment-16622002
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on issue #5876: [FLINK-9126] New CassandraPojoInputFormat 
to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#issuecomment-423177459
 
 
   Hi @bmeriaux, @Jicaar
   @bmeriaux, would you have time to polish a bit more and finish the PR? and 
what would be ETA for it then?
   @Jicaar might not have time for this atm (please, correct me if it is not 
the case)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619671#comment-16619671
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

bmeriaux commented on issue #5876: [FLINK-9126] New CassandraPojoInputFormat to 
output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#issuecomment-422541148
 
 
   I was ready to do a PR for the same feature :),it looks good to me


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614487#comment-16614487
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on issue #5876: [FLINK-9126] New CassandraPojoInputFormat 
to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#issuecomment-421264114
 
 
   hi @Jicaar, thanks for working on comments, 
   do you intend to do more changes in this PR or it is ready for review? 
   If more changes are coming, what would be ETA?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610320#comment-16610320
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207954696
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormat.class);
+
+   private final String query;
+   private final ClusterBuilder builder;
+
+   private transient Cluster cluster;
+   private transient Session session;
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, Class<OUT> inputClass) {
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
+   Preconditions.checkArgument(inputClass != null, "InputClass 
cannot be null");
+
+   this.query = query;
+   this.builder = builder;
+   this.inputClass = inputClass;
+   }
+
+   @Override
+   public void configure(Configuration parameters) {
+   this.cluster = builder.getCluster();
+   }
+
+   @Override
+   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
+   return cachedStatistics;
+   }
+
+   /**
+* Opens a Session and executes the query.
+*
+* @param ignored
+* @throws IOException
+*/
+   @Override
+   public void open(InputSplit ignored) throws IOException {
+   this.session = cluster.connect();
+   MappingManager manager = new MappingManager(session);
+
+   Mapper<OUT> mapper = manager.mapper(inputClass);
 
 Review comment:
   I would also at least overload the `CassandraPojoInputFormat` constructor with an additional parameter which is a factory for `Mapper`, because right now it is hidden in the internals of this class, but users might want to configure it in a special way (e.g. `Mapper.setDefaultSaveOptions(...)`), similar to `ClusterBuilder`.
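
   For illustration, a minimal sketch of such an overload (the `MapperConfigurator` hook, the extra field, and its use in `open()` are assumptions for illustration only, not the API that was eventually merged):

   /** Hypothetical serializable hook for configuring the Datastax Mapper before it is used. */
   public interface MapperConfigurator extends java.io.Serializable {
       void configure(Mapper<?> mapper);
   }

   private MapperConfigurator mapperConfigurator;   // assumed new field

   public CassandraPojoInputFormat(String query, ClusterBuilder builder, Class<OUT> inputClass,
           MapperConfigurator mapperConfigurator) {
       this(query, builder, inputClass);
       this.mapperConfigurator = mapperConfigurator;
   }

   @Override
   public void open(InputSplit ignored) throws IOException {
       this.session = cluster.connect();
       Mapper<OUT> mapper = new MappingManager(session).mapper(inputClass);
       if (mapperConfigurator != null) {
           // lets callers apply e.g. mapper.setDefaultSaveOptions(...) before the query runs
           mapperConfigurator.configure(mapper);
       }
       this.resultSet = mapper.map(session.execute(query));
   }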


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603582#comment-16603582
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

Jicaar commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r215059658
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormat.class);
+
+   private final String query;
+   private final ClusterBuilder builder;
+
+   private transient Cluster cluster;
+   private transient Session session;
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, Class<OUT> inputClass) {
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
+   Preconditions.checkArgument(inputClass != null, "InputClass 
cannot be null");
+
+   this.query = query;
+   this.builder = builder;
+   this.inputClass = inputClass;
+   }
+
+   @Override
+   public void configure(Configuration parameters) {
+   this.cluster = builder.getCluster();
+   }
+
+   @Override
+   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
+   return cachedStatistics;
+   }
+
+   /**
+* Opens a Session and executes the query.
+*
+* @param ignored
+* @throws IOException
+*/
+   @Override
+   public void open(InputSplit ignored) throws IOException {
+   this.session = cluster.connect();
+   MappingManager manager = new MappingManager(session);
 
 Review comment:
   Have made these changes, will be committing them soon once tests are fixed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-09-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603580#comment-16603580
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

Jicaar commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r215059322
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/CustomCassandraAnnotatedPojo.java
 ##
 @@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import com.datastax.driver.mapping.annotations.Column;
+import com.datastax.driver.mapping.annotations.Table;
+
+/**
+ * Example of Cassandra Annotated POJO class for use with 
CassandraInputFormatter.
 
 Review comment:
   ok


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Assignee: Andrey Zagrebin
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570463#comment-16570463
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207951006
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/CustomCassandraAnnotatedPojo.java
 ##
 @@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import com.datastax.driver.mapping.annotations.Column;
+import com.datastax.driver.mapping.annotations.Table;
+
+/**
+ * Example of Cassandra Annotated POJO class for use with 
CassandraInputFormatter.
 
 Review comment:
   I think we can add a link to the pojo implementation:
   ... for use with {@link CassandraPojoInputFormat}
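
   i.e., the class-level javadoc of CustomCassandraAnnotatedPojo could read (a sketch of the suggested wording):

   /**
    * Example of Cassandra Annotated POJO class for use with {@link CassandraPojoInputFormat}.
    */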


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570468#comment-16570468
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207952349
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing how to use the CassandraPojoInput-/CassandraOutputFormats in the Batch API.
+ *
+ * The example assumes that a table exists in a local cassandra database, 
according to the following queries:
+ * CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 
'SimpleStrategy', 'replication_factor': '1'};
+ * CREATE TABLE IF NOT EXISTS test.batches (number int, strings text, PRIMARY 
KEY(number, strings));
+ */
+public class BatchPojoExample {
+   private static final String INSERT_QUERY = "INSERT INTO test.batches 
(number, strings) VALUES (?,?);";
+   private static final String SELECT_QUERY = "SELECT number, strings FROM 
test.batches;";
+
+   /*
+*  table script: "CREATE TABLE test.batches (number int, strings 
text, PRIMARY KEY(number, strings));"
+*/
+   public static void main(String[] args) throws Exception {
+
+   ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+   env.setParallelism(1);
+
+   ArrayList<Tuple2<Integer, String>> collection = new ArrayList<>(20);
+   for (int i = 0; i < 20; i++) {
+   collection.add(new Tuple2<>(i, "string " + i));
+   }
+
+   DataSet<Tuple2<Integer, String>> dataSet = env.fromCollection(collection);
+
+   dataSet.output(new CassandraTupleOutputFormat<Tuple2<Integer, String>>(INSERT_QUERY, new ClusterBuilder() {
+   @Override
+   protected Cluster buildCluster(Cluster.Builder builder) 
{
+   return 
builder.addContactPoints("127.0.0.1").build();
+   }
+   }));
+
+   env.execute("Write");
+
+   /*DataSet> inputDS = env
 
 Review comment:
   let's remove the unused code


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570467#comment-16570467
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207954696
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormat.class);
+
+   private final String query;
+   private final ClusterBuilder builder;
+
+   private transient Cluster cluster;
+   private transient Session session;
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, Class<OUT> inputClass) {
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
+   Preconditions.checkArgument(inputClass != null, "InputClass 
cannot be null");
+
+   this.query = query;
+   this.builder = builder;
+   this.inputClass = inputClass;
+   }
+
+   @Override
+   public void configure(Configuration parameters) {
+   this.cluster = builder.getCluster();
+   }
+
+   @Override
+   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
+   return cachedStatistics;
+   }
+
+   /**
+* Opens a Session and executes the query.
+*
+* @param ignored
+* @throws IOException
+*/
+   @Override
+   public void open(InputSplit ignored) throws IOException {
+   this.session = cluster.connect();
+   MappingManager manager = new MappingManager(session);
+
+   Mapper<OUT> mapper = manager.mapper(inputClass);
 
 Review comment:
   I would also at least overload the `CassandraPojoInputFormat` constructor with an additional
parameter that is a factory for `Mapper`, because right now it is hidden in the internals of this
class, but users might want to configure it in a special way (e.g. `Mapper.setDefaultSaveOptions(...)`),
similar to `ClusterBuilder`.
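
A minimal sketch of what such an overload could look like, assuming a hypothetical `MapperConfigurer` callback (the name and shape are illustrative only, not part of this PR):

    import com.datastax.driver.mapping.Mapper;

    import java.io.Serializable;

    /**
     * Hypothetical callback that would let users tweak the DataStax Mapper
     * (e.g. mapper.setDefaultSaveOptions(...)) before the query is executed.
     */
    @FunctionalInterface
    interface MapperConfigurer<T> extends Serializable {
        void configure(Mapper<T> mapper);
    }

    // The overloaded constructor could then accept the callback next to the existing
    // arguments and invoke it in open() right after manager.mapper(inputClass):
    //
    // public CassandraPojoInputFormat(String query, ClusterBuilder builder,
    //         Class<OUT> inputClass, MapperConfigurer<OUT> mapperConfigurer) { ... }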


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570462#comment-16570462
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207947002
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
 
 Review comment:
   `serialVersionUID` field should be added, as it is a `Serializable` class
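
For illustration, the suggested field would simply be (the constant's value is arbitrary):

    // serialVersionUID for the Serializable CassandraPojoInputFormat; the value is arbitrary.
    private static final long serialVersionUID = 1L;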


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570464#comment-16570464
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207947856
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormat.class);
+
+   private final String query;
+   private final ClusterBuilder builder;
+
+   private transient Cluster cluster;
+   private transient Session session;
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, Class<OUT> inputClass) {
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
+   Preconditions.checkArgument(inputClass != null, "InputClass 
cannot be null");
+
+   this.query = query;
+   this.builder = builder;
+   this.inputClass = inputClass;
+   }
+
+   @Override
+   public void configure(Configuration parameters) {
+   this.cluster = builder.getCluster();
+   }
+
+   @Override
+   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
+   return cachedStatistics;
+   }
+
+   /**
+* Opens a Session and executes the query.
+*
+* @param ignored
 
 Review comment:
   I would add a comment to `ignored`, explaining that it is ignored because the input is
not parallelizable.
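
One possible wording for the Javadoc, as a suggestion only:

    /**
     * Opens a Session and executes the query.
     *
     * @param ignored ignored because the format implements {@link NonParallelInput},
     *                so the single generic split carries no information
     * @throws IOException if the Session could not be opened
     */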


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570466#comment-16570466
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207950599
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormat.class);
+
+   private final String query;
+   private final ClusterBuilder builder;
+
+   private transient Cluster cluster;
+   private transient Session session;
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, Class<OUT> inputClass) {
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
+   Preconditions.checkArgument(inputClass != null, "InputClass 
cannot be null");
+
+   this.query = query;
+   this.builder = builder;
+   this.inputClass = inputClass;
+   }
+
+   @Override
+   public void configure(Configuration parameters) {
+   this.cluster = builder.getCluster();
+   }
+
+   @Override
+   public BaseStatistics getStatistics(BaseStatistics cachedStatistics) 
throws IOException {
+   return cachedStatistics;
+   }
+
+   /**
+* Opens a Session and executes the query.
+*
+* @param ignored
+* @throws IOException
+*/
+   @Override
+   public void open(InputSplit ignored) throws IOException {
+   this.session = cluster.connect();
+   MappingManager manager = new MappingManager(session);
 
 Review comment:
   This is basically the only important place, along with the `nextRecord` method,
where it differs from the tuple version, `CassandraInputFormat`.
   
   I would suggest refactoring both formats to share one abstract base class and
override the relevant methods. In general only the mapping logic differs, so this
way we can avoid code duplication and increase testability.
   
   Maybe it would make sense to create a mapper abstraction from a Cassandra row to
some object (in this case a tuple or POJO) and configure the base format class with
the corresponding mapper in the constructors of the concrete formats.
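
A rough sketch of such a shared base class, purely to illustrate the suggestion (names and method split do not reflect any final implementation):

    import org.apache.flink.api.common.io.NonParallelInput;
    import org.apache.flink.api.common.io.RichInputFormat;
    import org.apache.flink.api.common.io.statistics.BaseStatistics;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.core.io.InputSplit;
    import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    import java.io.IOException;

    /**
     * Illustrative shared base for the tuple and POJO input formats: connection
     * handling lives here, while subclasses only implement how rows map to OUT.
     */
    abstract class CassandraInputFormatBase<OUT> extends RichInputFormat<OUT, InputSplit>
            implements NonParallelInput {

        private static final long serialVersionUID = 1L;

        protected final String query;
        protected final ClusterBuilder builder;

        protected transient Cluster cluster;
        protected transient Session session;

        CassandraInputFormatBase(String query, ClusterBuilder builder) {
            this.query = query;
            this.builder = builder;
        }

        @Override
        public void configure(Configuration parameters) {
            this.cluster = builder.getCluster();
        }

        @Override
        public BaseStatistics getStatistics(BaseStatistics cachedStatistics) throws IOException {
            return cachedStatistics;
        }

        @Override
        public void close() throws IOException {
            if (session != null) {
                session.close();
            }
            if (cluster != null) {
                cluster.close();
            }
        }

        // open(), reachedEnd() and nextRecord() are left to the concrete subclasses,
        // since that is where the tuple and POJO versions actually differ.
    }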


This is an automated message from the Apache Git Service.

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570461#comment-16570461
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207947260
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/batch/connectors/cassandra/CassandraPojoInputFormat.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra;
+
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.NonParallelInput;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Strings;
+
+import com.datastax.driver.core.Cluster;
+import com.datastax.driver.core.Session;
+import com.datastax.driver.mapping.Mapper;
+import com.datastax.driver.mapping.MappingManager;
+import com.datastax.driver.mapping.Result;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+/**
+ * InputFormat to read data from Apache Cassandra and generate a custom 
Cassandra annotated object.
+ * @param <OUT> type of inputClass
+ */
+public class CassandraPojoInputFormat<OUT> extends RichInputFormat<OUT, InputSplit> implements NonParallelInput {
+   private static final Logger LOG = 
LoggerFactory.getLogger(CassandraInputFormat.class);
+
+   private final String query;
+   private final ClusterBuilder builder;
+
+   private transient Cluster cluster;
+   private transient Session session;
+   private transient Result<OUT> resultSet;
+   private Class<OUT> inputClass;
+
+   public CassandraPojoInputFormat(String query, ClusterBuilder builder, Class<OUT> inputClass) {
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(query), 
"Query cannot be null or empty");
+   Preconditions.checkArgument(builder != null, "Builder cannot be 
null");
 
 Review comment:
   I would consider `Preconditions.checkNotNull` for readability
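
For example, since Flink's `Preconditions.checkNotNull(reference, message)` returns the checked reference, the pure null checks could become:

    // checkArgument is still needed for the empty-query check; the null checks can use checkNotNull:
    this.builder = Preconditions.checkNotNull(builder, "Builder cannot be null");
    this.inputClass = Preconditions.checkNotNull(inputClass, "InputClass cannot be null");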


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jeffrey Carter
>Priority: Minor
>  Labels: InputFormat, cassandra, features, pull-request-available
> Fix For: 1.7.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570465#comment-16570465
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

azagrebin commented on a change in pull request #5876: [FLINK-9126] New 
CassandraPojoInputFormat to output data as a custom annotated Cassandra Pojo
URL: https://github.com/apache/flink/pull/5876#discussion_r207953634
 
 

 ##
 File path: 
flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/example/BatchPojoExample.java
 ##
 @@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.batch.connectors.cassandra.example;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.batch.connectors.cassandra.CassandraPojoInputFormat;
+import org.apache.flink.batch.connectors.cassandra.CassandraTupleOutputFormat;
+import 
org.apache.flink.batch.connectors.cassandra.CustomCassandraAnnotatedPojo;
+import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;
+
+import com.datastax.driver.core.Cluster;
+
+import java.util.ArrayList;
+
+/**
+ * This is an example showing how to use the CassandraPojoInput-/CassandraOutputFormats in the Batch API.
+ *
+ * The example assumes that a table exists in a local Cassandra database, according to the following queries:
+ * CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
+ * CREATE TABLE IF NOT EXISTS test.batches (number int, strings text, PRIMARY KEY(number, strings));
+ */
+public class BatchPojoExample {
+   private static final String INSERT_QUERY = "INSERT INTO test.batches 
(number, strings) VALUES (?,?);";
+   private static final String SELECT_QUERY = "SELECT number, strings FROM 
test.batches;";
+
+   /*
+*  table script: "CREATE TABLE test.batches (number int, strings 
text, PRIMARY KEY(number, strings));"
+*/
+   public static void main(String[] args) throws Exception {
+
+   ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+   env.setParallelism(1);
+
+   ArrayList<Tuple2<Integer, String>> collection = new ArrayList<>(20);
+   for (int i = 0; i < 20; i++) {
+   collection.add(new Tuple2<>(i, "string " + i));
+   }
+
+   DataSet<Tuple2<Integer, String>> dataSet = env.fromCollection(collection);
+
+   dataSet.output(new CassandraTupleOutputFormat<Tuple2<Integer, String>>(INSERT_QUERY, new ClusterBuilder() {
+   @Override
+   protected Cluster buildCluster(Cluster.Builder builder) 
{
+   return 
builder.addContactPoints("127.0.0.1").build();
+   }
+   }));
+
+   env.execute("Write");
+
+   /*DataSet<Tuple2<Integer, String>> inputDS = env
+   .createInput(new CassandraInputFormat<Tuple2<Integer, String>>(SELECT_QUERY, new ClusterBuilder() {
+   @Override
+   protected Cluster buildCluster(Cluster.Builder 
builder) {
+   return 
builder.addContactPoints("127.0.0.1").build();
+   }
+   }), TupleTypeInfo.of(new TypeHint<Tuple2<Integer, String>>() {
+   }));
+
+   inputDS.print();*/
+
+   DataSet<CustomCassandraAnnotatedPojo> inputDS = env
+   .createInput(new CassandraPojoInputFormat<>(SELECT_QUERY, new ClusterBuilder() {
 
 Review comment:
   Is this work in progress? It seems the table declared in the example comment
(number, strings) does not correspond to the POJO used for querying,
`CustomCassandraAnnotatedPojo` (id, counter, batch_id). Would it be possible to
make the example fully working?
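
If the example should be fully self-contained, a POJO matching the test.batches table from the example comment could look roughly like this (purely illustrative; the actual CustomCassandraAnnotatedPojo in the PR uses different columns):

    import com.datastax.driver.mapping.annotations.Column;
    import com.datastax.driver.mapping.annotations.Table;

    /** Illustrative DataStax-annotated POJO for the test.batches (number int, strings text) table. */
    @Table(keyspace = "test", name = "batches")
    public class BatchesPojo {

        @Column(name = "number")
        private int number;

        @Column(name = "strings")
        private String strings;

        /** The DataStax mapper requires a no-arg constructor. */
        public BatchesPojo() {
        }

        public BatchesPojo(int number, String strings) {
            this.number = number;
            this.strings = strings;
        }

        public int getNumber() {
            return number;
        }

        public void setNumber(int number) {
            this.number = number;
        }

        public String getStrings() {
            return strings;
        }

        public void setStrings(String strings) {
            this.strings = strings;
        }
    }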


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please 

[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-04-18 Thread Jeffrey Carter (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443125#comment-16443125
 ] 

Jeffrey Carter commented on FLINK-9126:
---

Pull request for this feature has been submitted.

> Creation of the CassandraPojoInputFormat class to output data into a Custom 
> Cassandra Annotated Pojo
> 
>
> Key: FLINK-9126
> URL: https://issues.apache.org/jira/browse/FLINK-9126
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API
>Affects Versions: 1.4.2
>Reporter: Jeffrey Carter
>Priority: Minor
>  Labels: easyfix, features, newbie
> Fix For: 1.6.0
>
> Attachments: CassandraPojoInputFormatText.rtf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> *First time proposing new update so apologies if I missed anything*
> Currently the DataSet API only has the ability to output data received from 
> Cassandra as a source in as a Tuple. This would be allow the data to be 
> output as a custom POJO that the user has created that has been annotated 
> using Datastax API. This would remove the need of  very long Tuples to be 
> created by the DataSet and then mapped to the custom POJO.
>  
> -The changes to the CassandraInputFormat object would be minimal, but would 
> require importing the Datastax API into the class-. Another option is to make 
> a similar, but slightly different class called CassandraPojoInputFormat.
> I have already gotten code for this working in my own project, but want other 
> thoughts as to the best way this should go about being implemented.
>  
> //Example of its use in main
> CassandraPojoInputFormat cassandraInputFormat = new 
> CassandraPojoInputFormat<>(queryToRun, defaultClusterBuilder, 
> CustomCassandraPojo.class);
>  cassandraInputFormat.configure(null);
>  cassandraInputFormat.open(null);
> DataSet outputTestSet = 
> exEnv.createInput(cassandraInputFormat, TypeInformation.of(new 
> TypeHint(){}));
>  
> //The class that I currently have set up
> [^CassandraPojoInputFormatText.rtf]
>  
> Will make another Jira Issue for the Output version next if this is approved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9126) Creation of the CassandraPojoInputFormat class to output data into a Custom Cassandra Annotated Pojo

2018-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443124#comment-16443124
 ] 

ASF GitHub Bot commented on FLINK-9126:
---

GitHub user Jicaar opened a pull request:

https://github.com/apache/flink/pull/5876

[FLINK-9126] New CassandraPojoInputFormat to output data as a custom 
annotated Cassandra Pojo

## What is the purpose of the change

Committing the new CassandraPojoInputFormat class. This works similarly to the
CassandraInputFormat class, but allows the data that is read from Cassandra to be
output as a custom POJO that the user has created, which has been annotated using
the Datastax API.
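
Roughly, the intended usage from the DataSet API looks like this (sketch only; clusterBuilder and the SELECT query are placeholders):

    // Sketch of the intended DataSet API usage; clusterBuilder and the SELECT query are placeholders.
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    CassandraPojoInputFormat<CustomCassandraAnnotatedPojo> inputFormat =
            new CassandraPojoInputFormat<>(
                    "SELECT * FROM test.batches;", clusterBuilder, CustomCassandraAnnotatedPojo.class);

    DataSet<CustomCassandraAnnotatedPojo> pojos = env.createInput(
            inputFormat, TypeInformation.of(new TypeHint<CustomCassandraAnnotatedPojo>() {}));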

## Brief change log

-Initial commit of the CassandraPojoInputFormat class and validation test.

## Verifying this change

-CassandraPojoInputFormat can be validated with the 
testCassandraBatchPojoFormat test in the CassandraConnectorITCaseTest.java file.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): no
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): don't know
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? yes
  - If yes, how is the feature documented? JavaDocs


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Jicaar/flink master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5876.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5876


commit 5a9f1795a61c7f27b2e5170c275a10e8ce269d77
Author: Jicaar 
Date:   2018-04-09T19:25:56Z

Also adding in code for CassandraPojoInputFormat class in connection with 
Jira task FLINK-9126
Merge remote-tracking branch 'upstream/master'

# Conflicts:
#   README.md
#   
flink-annotations/src/main/java/org/apache/flink/annotation/docs/ConfigGroup.java
#   
flink-annotations/src/main/java/org/apache/flink/annotation/docs/ConfigGroups.java
#   
flink-connectors/flink-connector-kafka-0.11/src/test/java/org/apache/flink/streaming/connectors/kafka/Kafka011ITCase.java
#   
flink-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaConsumerTestBase.java
#   
flink-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaTestBase.java
#   
flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java
#   
flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
#   flink-docs/pom.xml
#   
flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
#   
flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
#   flink-libraries/flink-sql-client/conf/sql-client-defaults.yaml
#   
flink-libraries/flink-sql-client/src/main/java/org/apache/flink/table/client/config/Environment.java
#   
flink-libraries/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/local/ExecutionContext.java
#   
flink-libraries/flink-sql-client/src/test/resources/test-sql-client-defaults.yaml
#   
flink-libraries/flink-sql-client/src/test/resources/test-sql-client-factory.yaml
#   
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/descriptors/TableDescriptorValidator.scala
#   
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/descriptors/TableSourceDescriptor.scala
#   
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/sources/TableSourceFactoryServiceTest.scala
#   
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/sources/TestTableSourceFactory.scala
#   
flink-queryable-state/flink-queryable-state-client-java/src/main/java/org/apache/flink/queryablestate/network/messages/MessageType.java
#   
flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/BootstrapTools.java
#   
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/NettyConfig.java
#   
flink-runtime/src/main/java/org/apache/flink/runtime/query/netty/message/KvStateRequestType.java
#   
flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerConfiguration.java
#   
flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala
#