[jira] [Commented] (FLINK-9637) Add public user documentation for TTL feature

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563163#comment-16563163
 ] 

ASF GitHub Bot commented on FLINK-9637:
---

bowenli86 commented on a change in pull request #6379: [FLINK-9637] Add public 
user documentation for state TTL feature
URL: https://github.com/apache/flink/pull/6379#discussion_r206399972
 
 

 ##
 File path: docs/dev/stream/state/state.md
 ##
 @@ -266,6 +266,92 @@
 a `ValueState`. Once the count reaches 2 it will emit the average and clear the
 state so that we start over from `0`. Note that this would keep a different
 state value for each different input key if we had tuples with different values
 in the first field.
 
+### State time-to-live (TTL)
+
+A time-to-live (TTL) can be assigned to the keyed state value.
+In this case it will expire after the configured TTL,
+and its stored value will be cleaned up on a best-effort basis.
+Depending on the configuration, the expired state can become unavailable
+for read access even if it has not been cleaned up yet.
+In this case it behaves as if it no longer exists.
+
+Collection types of state support TTL at the entry level:
+list elements and map entries expire independently.
+
+The behaviour of state with TTL must first be configured by building a `StateTtlConfiguration`:
+
+
+
+{% highlight java %}
+StateTtlConfiguration ttlConfig = StateTtlConfiguration
+	.newBuilder(Time.seconds(1))
+	.setTtlUpdateType(StateTtlConfiguration.TtlUpdateType.OnCreateAndWrite)
 
 Review comment:
   My gut feeling is that the class names are currently quite verbose - e.g. the
duplicated 'Ttl' in the names. If it's not too late for release 1.6.0, it would
be nice to shorten this to something like
`StateTtlConfig.UpdateType.OnCreateAndWrite`.
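
   For illustration, a sketch of how the builder chain might read with the
shorter names proposed above (the `setStateVisibility` line is shown only as a
plausible companion setting, not quoted from the PR):

{code}
StateTtlConfig ttlConfig = StateTtlConfig
	.newBuilder(Time.seconds(1))
	.setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
	.setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
	.build();
{code}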


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add public user documentation for TTL feature
> -
>
> Key: FLINK-9637
> URL: https://issues.apache.org/jira/browse/FLINK-9637
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Affects Versions: 1.6.0
>Reporter: Andrey Zagrebin
>Assignee: Andrey Zagrebin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9926) Allow for ShardConsumer override in Kinesis consumer

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563161#comment-16563161
 ] 

ASF GitHub Bot commented on FLINK-9926:
---

tzulitai commented on issue #6427: [FLINK-9926][Kinesis Connector] Allow for 
ShardConsumer override in Kinesis consumer.
URL: https://github.com/apache/flink/pull/6427#issuecomment-409101923
 
 
   Thanks all for the PR and reviews.
   +1, merging this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow for ShardConsumer override in Kinesis consumer
> 
>
> Key: FLINK-9926
> URL: https://issues.apache.org/jira/browse/FLINK-9926
> Project: Flink
>  Issue Type: Task
>  Components: Kinesis Connector
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Minor
>  Labels: pull-request-available
>
> There are various reasons why the user may want to override the consumer. 
> Examples are to optimize the run loop or to add additional metrics or 
> logging. Instead of baking the constructor into runFetcher, create a 
> customizable factory method.
>  
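
A minimal sketch of the factory-method idea (the signature and names are
assumptions for illustration, not the merged patch): the fetcher calls an
overridable method instead of constructing the consumer inline in runFetcher:

{code}
// In KinesisDataFetcher: subclasses override this to supply a customized
// ShardConsumer (e.g. with extra metrics or logging) without copying runFetcher().
protected ShardConsumer<T> createShardConsumer(
		Integer subscribedShardStateIndex,
		StreamShardHandle subscribedShard,
		SequenceNumber lastSequenceNum,
		ShardMetricsReporter shardMetricsReporter) {
	return new ShardConsumer<>(
		this, subscribedShardStateIndex, subscribedShard, lastSequenceNum, shardMetricsReporter);
}
{code}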



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9897) Further enhance adaptive reads in Kinesis Connector to depend on run loop time

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563155#comment-16563155
 ] 

ASF GitHub Bot commented on FLINK-9897:
---

tzulitai commented on issue #6408: [FLINK-9897][Kinesis Connector] Make 
adaptive reads depend on run loop time instead of fetchintervalmillis
URL: https://github.com/apache/flink/pull/6408#issuecomment-409101468
 
 
   +1, thanks @glaksh100 for the work and @tweise for the help on the reviews.
   Merging this to master and 1.6.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Further enhance adaptive reads in Kinesis Connector to depend on run loop time
> --
>
> Key: FLINK-9897
> URL: https://issues.apache.org/jira/browse/FLINK-9897
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.4.2, 1.5.1
>Reporter: Lakshmi Rao
>Assignee: Lakshmi Rao
>Priority: Major
>  Labels: pull-request-available
>
> In FLINK-9692, we introduced the ability for the shardConsumer to adaptively
> read more records based on the current average record size, to optimize for
> the 2 MB/sec shard limit. The feature maximizes maxNumberOfRecordsPerFetch
> assuming 5 reads/sec (as prescribed by Kinesis limits). When applications take
> more time to process records in the run loop, they are no longer able to read
> at a frequency of 5 reads/sec (even though their fetchIntervalMillis may be
> set to 200 ms). In such a scenario, maxNumberOfRecordsPerFetch should be
> calculated based on the time that the run loop actually takes, as opposed to
> fetchIntervalMillis.
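
As a rough sketch of the calculation the description asks for (not the merged
code; the shard-limit constant and the method shape are assumptions based on
the PR discussion further below):

{code}
private static final long KINESIS_SHARD_BYTES_PER_SECOND_LIMIT = 2 * 1024L * 1024L;

// Scale the next fetch size so that one loop pass, at the cadence actually
// measured, stays under the shard's 2 MB/sec read limit.
protected int adaptRecordsToRead(long runLoopTimeNanos, int numRecords,
		long recordBatchSizeBytes, int maxNumberOfRecordsPerFetch) {
	if (numRecords != 0 && runLoopTimeNanos != 0) {
		long averageRecordSizeBytes = recordBatchSizeBytes / numRecords;
		// loop passes per second, derived from the measured run loop time
		double loopFrequencyHz = 1_000_000_000.0d / runLoopTimeNanos;
		// bytes we may consume per pass without exceeding the shard limit
		double bytesPerRead = KINESIS_SHARD_BYTES_PER_SECOND_LIMIT / loopFrequencyHz;
		maxNumberOfRecordsPerFetch = (int) (bytesPerRead / averageRecordSizeBytes);
	}
	return maxNumberOfRecordsPerFetch;
}
{code}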



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9926) Allow for ShardConsumer override in Kinesis consumer

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563112#comment-16563112
 ] 

ASF GitHub Bot commented on FLINK-9926:
---

tweise commented on issue #6427: [FLINK-9926][Kinesis Connector] Allow for 
ShardConsumer override in Kinesis consumer.
URL: https://github.com/apache/flink/pull/6427#issuecomment-409083557
 
 
   @tzulitai it would be great if you could merge this PR. We also need it for 
   https://issues.apache.org/jira/browse/FLINK-4582


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow for ShardConsumer override in Kinesis consumer
> 
>
> Key: FLINK-9926
> URL: https://issues.apache.org/jira/browse/FLINK-9926
> Project: Flink
>  Issue Type: Task
>  Components: Kinesis Connector
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Minor
>  Labels: pull-request-available
>
> There are various reasons why the user may want to override the consumer. 
> Examples are to optimize the run loop or to add additional metrics or 
> logging. Instead of baking the constructor into runFetcher, create a 
> customizable factory method.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8037) Missing cast in integer arithmetic in TransactionalIdsGenerator#generateIdsToAbort

2018-07-30 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-8037:
--
Labels: kafka kafka-connect  (was: kafka-connect)

> Missing cast in integer arithmetic in 
> TransactionalIdsGenerator#generateIdsToAbort
> --
>
> Key: FLINK-8037
> URL: https://issues.apache.org/jira/browse/FLINK-8037
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Greg Hogan
>Priority: Minor
>  Labels: kafka, kafka-connect
>
> {code}
> public Set<String> generateIdsToAbort() {
>   Set<String> idsToAbort = new HashSet<>();
>   for (int i = 0; i < safeScaleDownFactor; i++) {
>     idsToAbort.addAll(generateIdsToUse(i * poolSize * totalNumberOfSubtasks));
> {code}
> The operands are integers, while generateIdsToUse() expects a long parameter.
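
A minimal sketch of the fix (not the committed patch; fields as in the snippet
above): widening one operand to long makes the multiplication happen in 64-bit
arithmetic before it can overflow:

{code}
// cast the loop index so i * poolSize * totalNumberOfSubtasks is computed as a long
idsToAbort.addAll(generateIdsToUse(((long) i) * poolSize * totalNumberOfSubtasks));
{code}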



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9970) Add ASCII/CHR function for table/sql API

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562994#comment-16562994
 ] 

ASF GitHub Bot commented on FLINK-9970:
---

yanghua commented on a change in pull request #6432: [FLINK-9970] Add ASCII/CHR 
function for table/sql API
URL: https://github.com/apache/flink/pull/6432#discussion_r206376636
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/calls/FunctionGenerator.scala
 ##
 @@ -146,12 +146,25 @@ object FunctionGenerator {
 STRING_TYPE_INFO,
 BuiltInMethod.OVERLAY.method)
 
+  addSqlFunctionMethod(
+ASCII,
+Seq(STRING_TYPE_INFO),
+STRING_TYPE_INFO,
 
 Review comment:
   I know that in normal scenarios we should return an int, but we need to
handle edge-case inputs properly: return null when the input is null, and
return "" when the input is "". If we returned the int type directly, it would
be difficult to handle these two cases, so I chose to return a string
representation of the numeric value.
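
To illustrate the two edge cases (a hypothetical helper for explanation only,
not the PR's generated code):

{code}
// Returns null for null input, "" for empty input, and otherwise the code
// point of the first character rendered as a string.
public static String ascii(String str) {
	if (str == null) {
		return null;
	}
	if (str.isEmpty()) {
		return "";
	}
	return String.valueOf((int) str.charAt(0));
}
{code}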


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add ASCII/CHR function for table/sql API
> 
>
> Key: FLINK-9970
> URL: https://issues.apache.org/jira/browse/FLINK-9970
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Minor
>  Labels: pull-request-available
>
> For the ASCII function, refer to:
> [https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_ascii]
> For the CHR function, which converts an ASCII code to a character, refer to:
> [https://doc.ispirer.com/sqlways/Output/SQLWays-1-071.html]
> Since "CHAR" is a keyword in many databases, we use the "CHR" keyword instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zhangminglei commented on a change in pull request #6075: [FLINK-9407] [hdfs connector] Support orc rolling sink writer

2018-07-30 Thread GitBox
zhangminglei commented on a change in pull request #6075: [FLINK-9407] [hdfs 
connector] Support orc rolling sink writer
URL: https://github.com/apache/flink/pull/6075#discussion_r206374395
 
 

 ##
 File path: 
flink-connectors/flink-orc/src/main/java/org/apache/flink/orc/OrcFileWriter.java
 ##
 @@ -0,0 +1,269 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.orc;
+
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.streaming.connectors.fs.StreamWriterBase;
+import org.apache.flink.streaming.connectors.fs.Writer;
+import org.apache.flink.table.api.TableSchema;
+import org.apache.flink.types.Row;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.orc.CompressionKind;
+import org.apache.orc.OrcFile;
+import org.apache.orc.TypeDescription;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+import java.util.stream.IntStream;
+
+import static org.apache.flink.orc.OrcBatchReader.schemaToTypeInfo;
+
+/**
+ * A {@link Writer} that writes the bucket files as Hadoop {@link OrcFile}.
+ *
+ * @param <T> The type of the elements that are being written by the sink.
+ */
+public class OrcFileWriter<T> extends StreamWriterBase<T> {
+
+	private static final long serialVersionUID = 3L;
+
+	/**
+	 * The description of the types in an ORC file.
+	 */
+	private TypeDescription schema;
+
+	/**
+	 * The schema of an ORC file.
+	 */
+	private String metaSchema;
+
+	/**
+	 * A row batch that will be written to the ORC file.
+	 */
+	private VectorizedRowBatch rowBatch;
+
+	/**
+	 * The writer that fills the records into the batch.
+	 */
+	private OrcBatchWriter orcBatchWriter;
+
+	private transient org.apache.orc.Writer writer;
+
+	private CompressionKind compressionKind;
+
+	/**
+	 * The number of rows that are currently being written.
+	 */
+	private long writedRowSize;
+
+	/**
+	 * Creates a new {@code OrcFileWriter} that writes ORC files without compression.
+	 *
+	 * @param metaSchema The orc schema.
+	 */
+	public OrcFileWriter(String metaSchema) {
+		this(metaSchema, CompressionKind.NONE);
+	}
+
+	/**
+	 * Creates a new {@code OrcFileWriter} that writes ORC files with the given
+	 * schema and compression kind.
+	 *
+	 * @param metaSchema      The schema of an orc file.
+	 * @param compressionKind The compression kind to use.
+	 */
+	public OrcFileWriter(String metaSchema, CompressionKind compressionKind) {
+		this.metaSchema = metaSchema;
+		this.schema = TypeDescription.fromString(metaSchema);
+		this.compressionKind = compressionKind;
+	}
+
+	@Override
+	public void open(FileSystem fs, Path path) throws IOException {
+		writer = OrcFile.createWriter(path, OrcFile.writerOptions(fs.getConf()).setSchema(schema).compress(compressionKind));
+		rowBatch = schema.createRowBatch();
+		orcBatchWriter = new OrcBatchWriter(Arrays.asList(orcSchemaToTableSchema(schema).getTypes()));
+	}
+
+	private TableSchema orcSchemaToTableSchema(TypeDescription orcSchema) {
+		List<String> fieldNames = orcSchema.getFieldNames();
+		List<TypeDescription> typeDescriptions = orcSchema.getChildren();
+		List<TypeInformation> typeInformations = new ArrayList<>();
+
+		typeDescriptions.forEach(typeDescription -> {
+			typeInformations.add(schemaToTypeInfo(typeDescription));
+		});
+
+		return new TableSchema(
+			fieldNames.toArray(new String[fieldNames.size()]),

[jira] [Comment Edited] (FLINK-9150) Prepare for Java 10

2018-07-30 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473198#comment-16473198
 ] 

Ted Yu edited comment on FLINK-9150 at 7/31/18 1:49 AM:


Similar error is encountered when building against jdk 11.


was (Author: yuzhih...@gmail.com):
Similar error is encountered when building against jdk 11 .

> Prepare for Java 10
> ---
>
> Key: FLINK-9150
> URL: https://issues.apache.org/jira/browse/FLINK-9150
> Project: Flink
>  Issue Type: Task
>  Components: Build System
>Reporter: Ted Yu
>Priority: Major
>
> Java 9 is not an LTS release.
> When compiling with Java 10, I see the following compilation error:
> {code}
> [ERROR] Failed to execute goal on project flink-shaded-hadoop2: Could not 
> resolve dependencies for project 
> org.apache.flink:flink-shaded-hadoop2:jar:1.6-SNAPSHOT: Could not find 
> artifact jdk.tools:jdk.tools:jar:1.6 at specified path 
> /a/jdk-10/../lib/tools.jar -> [Help 1]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9897) Further enhance adaptive reads in Kinesis Connector to depend on run loop time

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562745#comment-16562745
 ] 

ASF GitHub Bot commented on FLINK-9897:
---

glaksh100 edited a comment on issue #6408: [FLINK-9897][Kinesis Connector] Make 
adaptive reads depend on run loop time instead of fetchintervalmillis
URL: https://github.com/apache/flink/pull/6408#issuecomment-409059048
 
 
   @tzulitai Thanks for reviewing and apologies for the delay. I have addressed 
all the comments.
   I verified the most recent changes with a Flink job reading from a 
back-logged Kinesis stream and writing to a no-op sink. 
   - Number of Kinesis shards - 512
   - GetRecord I/O achieved ~= 1.04 Gb/sec
   - [numberOfAggregatedRecordsPerFetch](https://github.com/apache/flink/pull/6409) ~= `maxNumberOfRecordsPerFetch`

   Let me know if it looks alright! 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Further enhance adaptive reads in Kinesis Connector to depend on run loop time
> --
>
> Key: FLINK-9897
> URL: https://issues.apache.org/jira/browse/FLINK-9897
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.4.2, 1.5.1
>Reporter: Lakshmi Rao
>Assignee: Lakshmi Rao
>Priority: Major
>  Labels: pull-request-available
>
> In FLINK-9692, we introduced the ability for the shardConsumer to adaptively
> read more records based on the current average record size, to optimize for
> the 2 MB/sec shard limit. The feature maximizes maxNumberOfRecordsPerFetch
> assuming 5 reads/sec (as prescribed by Kinesis limits). When applications take
> more time to process records in the run loop, they are no longer able to read
> at a frequency of 5 reads/sec (even though their fetchIntervalMillis may be
> set to 200 ms). In such a scenario, maxNumberOfRecordsPerFetch should be
> calculated based on the time that the run loop actually takes, as opposed to
> fetchIntervalMillis.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



[jira] [Commented] (FLINK-9926) Allow for ShardConsumer override in Kinesis consumer

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562737#comment-16562737
 ] 

ASF GitHub Bot commented on FLINK-9926:
---

yxu-valleytider commented on a change in pull request #6427: 
[FLINK-9926][Kinesis Connector] Allow for ShardConsumer override in Kinesis 
consumer.
URL: https://github.com/apache/flink/pull/6427#discussion_r206369575
 
 

 ##
 File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcher.java
 ##
 @@ -204,7 +214,7 @@ public KinesisDataFetcher(List<String> streams,
 				new AtomicReference<>(),
 				new ArrayList<>(),
 				createInitialSubscribedStreamsToLastDiscoveredShardsState(streams),
-				KinesisProxy.create(configProps));
+				KinesisProxy::create);
 
 Review comment:
   awesome double-colon usage!
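
   For readers unfamiliar with the pattern: the method reference passes a
factory rather than an eagerly created proxy, so creation is deferred to the
fetcher and tests can substitute their own. A sketch (the interface name is
illustrative, not necessarily the one the PR defines):

{code}
// Functional interface matching the shape of KinesisProxy::create.
interface KinesisProxyFactory {
	KinesisProxyInterface create(Properties configProps);
}

// The fetcher can now create the proxy lazily, per subtask, from the
// consumer's configuration Properties.
KinesisProxyFactory factory = KinesisProxy::create;
KinesisProxyInterface proxy = factory.create(configProps);
{code}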


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow for ShardConsumer override in Kinesis consumer
> 
>
> Key: FLINK-9926
> URL: https://issues.apache.org/jira/browse/FLINK-9926
> Project: Flink
>  Issue Type: Task
>  Components: Kinesis Connector
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Minor
>  Labels: pull-request-available
>
> There are various reasons why the user may want to override the consumer. 
> Examples are to optimize the run loop or to add additional metrics or 
> logging. Instead of baking the constructor into runFetcher, create a 
> customizable factory method.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9926) Allow for ShardConsumer override in Kinesis consumer

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562725#comment-16562725
 ] 

ASF GitHub Bot commented on FLINK-9926:
---

glaksh100 commented on issue #6427: [FLINK-9926][Kinesis Connector] Allow for 
ShardConsumer override in Kinesis consumer.
URL: https://github.com/apache/flink/pull/6427#issuecomment-409059743
 
 
   +1 from my side. LGTM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow for ShardConsumer override in Kinesis consumer
> 
>
> Key: FLINK-9926
> URL: https://issues.apache.org/jira/browse/FLINK-9926
> Project: Flink
>  Issue Type: Task
>  Components: Kinesis Connector
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Minor
>  Labels: pull-request-available
>
> There are various reasons why the user may want to override the consumer. 
> Examples are to optimize the run loop or to add additional metrics or 
> logging. Instead of baking the constructor into runFetcher, create a 
> customizable factory method.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9897) Further enhance adaptive reads in Kinesis Connector to depend on run loop time

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562723#comment-16562723
 ] 

ASF GitHub Bot commented on FLINK-9897:
---

glaksh100 commented on issue #6408: [FLINK-9897][Kinesis Connector] Make 
adaptive reads depend on run loop time instead of fetchintervalmillis
URL: https://github.com/apache/flink/pull/6408#issuecomment-409059048
 
 
   @tzulitai Thanks for reviewing and apologies for the delay. I have addressed 
all the comments. Let me know if it looks alright! 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Further enhance adaptive reads in Kinesis Connector to depend on run loop time
> --
>
> Key: FLINK-9897
> URL: https://issues.apache.org/jira/browse/FLINK-9897
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.4.2, 1.5.1
>Reporter: Lakshmi Rao
>Assignee: Lakshmi Rao
>Priority: Major
>  Labels: pull-request-available
>
> In FLINK-9692, we introduced the ability for the shardConsumer to adaptively
> read more records based on the current average record size, to optimize for
> the 2 MB/sec shard limit. The feature maximizes maxNumberOfRecordsPerFetch
> assuming 5 reads/sec (as prescribed by Kinesis limits). When applications take
> more time to process records in the run loop, they are no longer able to read
> at a frequency of 5 reads/sec (even though their fetchIntervalMillis may be
> set to 200 ms). In such a scenario, maxNumberOfRecordsPerFetch should be
> calculated based on the time that the run loop actually takes, as opposed to
> fetchIntervalMillis.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9897) Further enhance adaptive reads in Kinesis Connector to depend on run loop time

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562699#comment-16562699
 ] 

ASF GitHub Bot commented on FLINK-9897:
---

glaksh100 commented on a change in pull request #6408: [FLINK-9897][Kinesis 
Connector] Make adaptive reads depend on run loop time instead of 
fetchintervalmillis
URL: https://github.com/apache/flink/pull/6408#discussion_r206363113
 
 

 ##
 File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java
 ##
 @@ -233,26 +225,69 @@ public void run() {
 							subscribedShard.getShard().getHashKeyRange().getEndingHashKey());
 
 						long recordBatchSizeBytes = 0L;
-						long averageRecordSizeBytes = 0L;
-
 						for (UserRecord record : fetchedRecords) {
 							recordBatchSizeBytes += record.getData().remaining();
 							deserializeRecordForCollectionAndUpdateState(record);
 						}
 
-						if (useAdaptiveReads && !fetchedRecords.isEmpty()) {
-							averageRecordSizeBytes = recordBatchSizeBytes / fetchedRecords.size();
-							maxNumberOfRecordsPerFetch = getAdaptiveMaxRecordsPerFetch(averageRecordSizeBytes);
-						}
-
 						nextShardItr = getRecordsResult.getNextShardIterator();
+
+						long processingEndTimeNanos = System.nanoTime();
+
+						long adjustmentEndTimeNanos = adjustRunLoopFrequency(processingStartTimeNanos, processingEndTimeNanos);
+						long runLoopTimeNanos = adjustmentEndTimeNanos - processingStartTimeNanos;
+						maxNumberOfRecordsPerFetch = adaptRecordsToRead(runLoopTimeNanos, fetchedRecords.size(), recordBatchSizeBytes);
+						processingStartTimeNanos = adjustmentEndTimeNanos; // for next time through the loop
 					}
 				}
 			} catch (Throwable t) {
 				fetcherRef.stopWithError(t);
 			}
 		}
 
+	/**
+	 * Adjusts loop timing to match target frequency if specified.
+	 * @param processingStartTimeNanos The start time of the run loop "work"
+	 * @param processingEndTimeNanos The end time of the run loop "work"
+	 * @return The System.nanoTime() after the sleep (if any)
+	 * @throws InterruptedException
+	 */
+	protected long adjustRunLoopFrequency(long processingStartTimeNanos, long processingEndTimeNanos)
+		throws InterruptedException {
+		long endTimeNanos = processingEndTimeNanos;
+		if (fetchIntervalMillis != 0) {
+			long processingTimeNanos = processingEndTimeNanos - processingStartTimeNanos;
+			long sleepTimeMillis = fetchIntervalMillis - (processingTimeNanos / 1_000_000);
+			if (sleepTimeMillis > 0) {
+				Thread.sleep(sleepTimeMillis);
+				endTimeNanos = System.nanoTime();
+			}
+		}
+		return endTimeNanos;
+	}
+
+	/**
+	 * Calculates how many records to read each time through the loop based on a target throughput
+	 * and the measured frequency of the loop.
+	 * @param runLoopTimeNanos The total time of one pass through the loop
+	 * @param numRecords The number of records of the last read operation
+	 * @param recordBatchSizeBytes The total batch size of the last read operation
+	 */
+	protected int adaptRecordsToRead(long runLoopTimeNanos, int numRecords, long recordBatchSizeBytes) {
 
 Review comment:
   I modified this to pass the `maxNumberOfRecordsPerFetch` as an argument. Let 
me know if that's the change you had in mind.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Further enhance adaptive reads in Kinesis Connector to depend on run loop time
> --
>
> Key: FLINK-9897
> URL: https://issues.apache.org/jira/browse/FLINK-9897
> Project: Flink
>  

[jira] [Commented] (FLINK-9897) Further enhance adaptive reads in Kinesis Connector to depend on run loop time

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562698#comment-16562698
 ] 

ASF GitHub Bot commented on FLINK-9897:
---

glaksh100 commented on a change in pull request #6408: [FLINK-9897][Kinesis 
Connector] Make adaptive reads depend on run loop time instead of 
fetchintervalmillis
URL: https://github.com/apache/flink/pull/6408#discussion_r206363100
 
 

 ##
 File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java
 ##
 @@ -233,26 +225,69 @@ public void run() {
 							subscribedShard.getShard().getHashKeyRange().getEndingHashKey());
 
 						long recordBatchSizeBytes = 0L;
-						long averageRecordSizeBytes = 0L;
-
 						for (UserRecord record : fetchedRecords) {
 							recordBatchSizeBytes += record.getData().remaining();
 							deserializeRecordForCollectionAndUpdateState(record);
 						}
 
-						if (useAdaptiveReads && !fetchedRecords.isEmpty()) {
-							averageRecordSizeBytes = recordBatchSizeBytes / fetchedRecords.size();
-							maxNumberOfRecordsPerFetch = getAdaptiveMaxRecordsPerFetch(averageRecordSizeBytes);
-						}
-
 						nextShardItr = getRecordsResult.getNextShardIterator();
+
+						long processingEndTimeNanos = System.nanoTime();
 Review comment:
   Removed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Further enhance adaptive reads in Kinesis Connector to depend on run loop time
> --
>
> Key: FLINK-9897
> URL: https://issues.apache.org/jira/browse/FLINK-9897
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.4.2, 1.5.1
>Reporter: Lakshmi Rao
>Assignee: Lakshmi Rao
>Priority: Major
>  Labels: pull-request-available
>
> In FLINK-9692, we introduced the ability for the shardConsumer to adaptively
> read more records based on the current average record size, to optimize for
> the 2 MB/sec shard limit. The feature maximizes maxNumberOfRecordsPerFetch
> assuming 5 reads/sec (as prescribed by Kinesis limits). When applications take
> more time to process records in the run loop, they are no longer able to read
> at a frequency of 5 reads/sec (even though their fetchIntervalMillis may be
> set to 200 ms). In such a scenario, maxNumberOfRecordsPerFetch should be
> calculated based on the time that the run loop actually takes, as opposed to
> fetchIntervalMillis.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9897) Further enhance adaptive reads in Kinesis Connector to depend on run loop time

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562695#comment-16562695
 ] 

ASF GitHub Bot commented on FLINK-9897:
---

glaksh100 commented on a change in pull request #6408: [FLINK-9897][Kinesis 
Connector] Make adaptive reads depend on run loop time instead of 
fetchintervalmillis
URL: https://github.com/apache/flink/pull/6408#discussion_r206362309
 
 

 ##
 File path: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java
 ##
 @@ -233,26 +225,69 @@ public void run() {
 							subscribedShard.getShard().getHashKeyRange().getEndingHashKey());
 
 						long recordBatchSizeBytes = 0L;
-						long averageRecordSizeBytes = 0L;
-
 						for (UserRecord record : fetchedRecords) {
 							recordBatchSizeBytes += record.getData().remaining();
 							deserializeRecordForCollectionAndUpdateState(record);
 						}
 
-						if (useAdaptiveReads && !fetchedRecords.isEmpty()) {
-							averageRecordSizeBytes = recordBatchSizeBytes / fetchedRecords.size();
-							maxNumberOfRecordsPerFetch = getAdaptiveMaxRecordsPerFetch(averageRecordSizeBytes);
-						}
-
 						nextShardItr = getRecordsResult.getNextShardIterator();
+
+						long processingEndTimeNanos = System.nanoTime();
+
+						long adjustmentEndTimeNanos = adjustRunLoopFrequency(processingStartTimeNanos, processingEndTimeNanos);
 
 Review comment:
   Changed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Further enhance adaptive reads in Kinesis Connector to depend on run loop time
> --
>
> Key: FLINK-9897
> URL: https://issues.apache.org/jira/browse/FLINK-9897
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.4.2, 1.5.1
>Reporter: Lakshmi Rao
>Assignee: Lakshmi Rao
>Priority: Major
>  Labels: pull-request-available
>
> In FLINK-9692, we introduced the ability for the shardConsumer to adaptively
> read more records based on the current average record size, to optimize for
> the 2 MB/sec shard limit. The feature maximizes maxNumberOfRecordsPerFetch
> assuming 5 reads/sec (as prescribed by Kinesis limits). When applications take
> more time to process records in the run loop, they are no longer able to read
> at a frequency of 5 reads/sec (even though their fetchIntervalMillis may be
> set to 200 ms). In such a scenario, maxNumberOfRecordsPerFetch should be
> calculated based on the time that the run loop actually takes, as opposed to
> fetchIntervalMillis.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9970) Add ASCII/CHR function for table/sql API

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562628#comment-16562628
 ] 

ASF GitHub Bot commented on FLINK-9970:
---

suez1224 commented on a change in pull request #6432: [FLINK-9970] Add 
ASCII/CHR function for table/sql API
URL: https://github.com/apache/flink/pull/6432#discussion_r206346433
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/calls/FunctionGenerator.scala
 ##
 @@ -146,12 +146,25 @@ object FunctionGenerator {
 STRING_TYPE_INFO,
 BuiltInMethod.OVERLAY.method)
 
+  addSqlFunctionMethod(
+ASCII,
+Seq(STRING_TYPE_INFO),
+STRING_TYPE_INFO,
 
 Review comment:
   Should it return int type instead?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add ASCII/CHR function for table/sql API
> 
>
> Key: FLINK-9970
> URL: https://issues.apache.org/jira/browse/FLINK-9970
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Minor
>  Labels: pull-request-available
>
> For the ASCII function, refer to:
> [https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_ascii]
> For the CHR function, which converts an ASCII code to a character, refer to:
> [https://doc.ispirer.com/sqlways/Output/SQLWays-1-071.html]
> Since "CHAR" is a keyword in many databases, we use the "CHR" keyword instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9970) Add ASCII/CHR function for table/sql API

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562630#comment-16562630
 ] 

ASF GitHub Bot commented on FLINK-9970:
---

suez1224 commented on a change in pull request #6432: [FLINK-9970] Add 
ASCII/CHR function for table/sql API
URL: https://github.com/apache/flink/pull/6432#discussion_r206346433
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/calls/FunctionGenerator.scala
 ##
 @@ -146,12 +146,25 @@ object FunctionGenerator {
 STRING_TYPE_INFO,
 BuiltInMethod.OVERLAY.method)
 
+  addSqlFunctionMethod(
+ASCII,
+Seq(STRING_TYPE_INFO),
+STRING_TYPE_INFO,
 
 Review comment:
   Should it return int type instead like in transact SQL, 
https://docs.microsoft.com/en-us/sql/t-sql/functions/ascii-transact-sql?view=sql-server-2017?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add ASCII/CHR function for table/sql API
> 
>
> Key: FLINK-9970
> URL: https://issues.apache.org/jira/browse/FLINK-9970
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Minor
>  Labels: pull-request-available
>
> For the ASCII function, refer to: 
> [https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_ascii]
> For the CHR function, which converts an ASCII code to a character, refer to: 
> [https://doc.ispirer.com/sqlways/Output/SQLWays-1-071.html]
> Considering that "CHAR" is a keyword in many databases, we use the "CHR" 
> keyword.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9977) Refine the docs for Table/SQL built-in functions

2018-07-30 Thread Xingcan Cui (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562210#comment-16562210
 ] 

Xingcan Cui commented on FLINK-9977:


Hi [~twalthr], I've refined the documents for the SQL built-in functions (see [this 
branch|https://github.com/xccui/flink/tree/FLINK-9977-udfdocs]).
 Please take a look when it's convenient for you. Some problems came up during 
the process.

1. It seems that the escape character is not yet supported in functions such 
as "LIKE".
2. Why doesn't LOG(B, X) support B < 1? E.g., LOG(0.01, 0.1) should return 2.0 (see the sketch after this list).
3. The functions treat NULL in different ways (e.g., ABS(NULL) returns NULL, 
while LOG10(NULL) throws an NPE), which may confuse users. I wonder whether 
we should unify them.
4. It's a little weird that the two mathematical constants, PI and E, are 
accessed in different ways, i.e., PI and E(). I'm not sure if there are some reasons 
for Calcite to take PI as a {{SqlBaseContextVariable}} like "USER" and 
"CURRENT_ROLE".
5. IMO, the BIN function, which returns a string representation of an integer 
numeric in binary format, should be classified as a string function.
6. I think the "INTERVAL string range" in the temporal functions should be a kind of 
literal notation instead of a function.
7. "TIMESTAMPADD(MINUTE, 1, DATE '2016-06-15')", "2016-06-16" will throw a 
ClassCastException (java.lang.Integer cannot be cast to java.lang.Long), which 
seems to be a bug. (It tries to cast an integer date to a long timestamp in 
RexBuilder.java:1524 - return TimestampString.fromMillisSinceEpoch((Long) o).)
8. SqlExpressionTest.scala only covers tests for part of the functions. Not 
sure if we should complete it.
9. I cannot figure out what the "[, value ]*" in "COUNT( [ ALL | DISTINCT ] 
value [, value ]*)" means, thus just temporarily removed it.
10. The Table API doesn't support distinct aggregation.
11. The style for  is not so distinctive.
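
Regarding point 2, a quick change-of-base check (plain Java, not Flink's implementation); note that which argument LOG treats as the base determines the expected value:

{code:java}
// log_b(x) = ln(x) / ln(b), well defined for any base b in (0, 1) or (1, inf).
static double logBase(double b, double x) {
    return Math.log(x) / Math.log(b);
}
// logBase(0.01, 0.1) == 0.5, while logBase(0.1, 0.01) == 2.0.
{code}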

> Refine the docs for Table/SQL built-in functions
> 
>
> Key: FLINK-9977
> URL: https://issues.apache.org/jira/browse/FLINK-9977
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Xingcan Cui
>Assignee: Xingcan Cui
>Priority: Minor
>
> There exist some syntax errors or inconsistencies in documents and Scala docs 
> of the Table/SQL built-in functions. This issue aims to make some 
> improvements to them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9875) Add concurrent creation of execution job vertex

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562203#comment-16562203
 ] 

ASF GitHub Bot commented on FLINK-9875:
---

StephanEwen commented on issue #6353: [FLINK-9875][runtime] Add concurrent 
creation of execution job vertex
URL: https://github.com/apache/flink/pull/6353#issuecomment-408947775
 
 
   I would suggest first driving a discussion about what the interface for the 
creation of input splits should be; the implementation then follows from that interface.
   
   Design before implementation is always a good idea for efforts like this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add concurrent creation of execution job vertex
> ---
>
> Key: FLINK-9875
> URL: https://issues.apache.org/jira/browse/FLINK-9875
> Project: Flink
>  Issue Type: Improvement
>  Components: Local Runtime
>Affects Versions: 1.5.1
>Reporter: 陈梓立
>Assignee: 陈梓立
>Priority: Major
>  Labels: pull-request-available
>
> In some cases, such as InputFormat vertices, creation of the execution job vertex is 
> time-consuming; this PR adds concurrent creation of execution job vertices to 
> accelerate it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562202#comment-16562202
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408947692
 
 
   This works fine with state.backend.rocksdb.localdir = /some/file/path. At least 
no (relative path) exception is thrown if we give the config in the 
above-mentioned way. 
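
   For reference, the programmatic counterpart of that key, a sketch assuming the RocksDBStateBackend API; the checkpoint URI and local path are illustrative:

{code:java}
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Checkpoints go to a remote URI; RocksDB working files go to a plain local directory.
RocksDBStateBackend backend = new RocksDBStateBackend("hdfs://namenode:40010/flink/checkpoints"); // throws IOException
backend.setDbStoragePath("/some/file/path"); // programmatic equivalent of state.backend.rocksdb.localdir
env.setStateBackend(backend);
{code}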


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user reported on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception claiming the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562186#comment-16562186
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

StephanEwen commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408945091
 
 
   The character `:` is actually the standard UNIX path-list separator. Not 
supporting it would be kind of strange and would break existing setups. We should 
retain that.
   
   What about simply not having URIs with a scheme in the config, but only plain 
paths? Does that work?
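
   A tiny illustration of the ambiguity; hypothetical parsing code, not the actual connector logic:

{code:java}
// Splitting the value on File.pathSeparator (":" on UNIX) breaks scheme-qualified URIs:
String localDirs = "file:///home/abc/share:/data/rocksdb";
String[] parts = localDirs.split(java.io.File.pathSeparator);
// parts == ["file", "///home/abc/share", "/data/rocksdb"] on UNIX, so a URI with a
// scheme cannot coexist with ':'-separated plain paths in the same config value.
{code}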


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user reported on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception claiming the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-3930) Implement Service-Level Authorization

2018-07-30 Thread Eron Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eron Wright  closed FLINK-3930.
---
Resolution: Won't Fix

_This issue relates to the obsolete plan to use a 'shared secret' for client 
authentication.  Instead, SSL mutual authentication was implemented._

> Implement Service-Level Authorization
> -
>
> Key: FLINK-3930
> URL: https://issues.apache.org/jira/browse/FLINK-3930
> Project: Flink
>  Issue Type: New Feature
>  Components: Security
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>Priority: Major
>  Labels: security
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> _This issue is part of a series of improvements detailed in the [Secure Data 
> Access|https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing]
>  design doc._
> Service-level authorization is the initial authorization mechanism to ensure 
> clients (or servers) connecting to the Flink cluster are authorized to do so. 
>   The purpose is to prevent a cluster from being used by an unauthorized 
> user, whether to execute jobs, disrupt cluster functionality, or gain access 
> to secrets stored within the cluster.
> Implement service-level authorization as described in the design doc.
> - Introduce a shared secret cookie
> - Enable Akka security cookie
> - Implement data transfer authentication
> - Secure the web dashboard



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-4919) Add secure cookie support for the cluster deployed in Mesos environment

2018-07-30 Thread Eron Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eron Wright  closed FLINK-4919.
---
Resolution: Won't Fix

_This issue relates to the obsolete plan to use a 'shared secret' for client 
authentication.  Instead, SSL mutual authentication was implemented._

> Add secure cookie support for the cluster deployed in Mesos environment
> ---
>
> Key: FLINK-4919
> URL: https://issues.apache.org/jira/browse/FLINK-4919
> Project: Flink
>  Issue Type: Sub-task
>  Components: Mesos, Security
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-4637) Address Yarn proxy incompatibility with Flink Web UI when service level authorization is enabled

2018-07-30 Thread Eron Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eron Wright  closed FLINK-4637.
---
Resolution: Won't Fix

_This issue relates to the obsolete plan to use a 'shared secret' for client 
authentication.  Instead, SSL mutual authentication was implemented._

> Address Yarn proxy incompatibility with Flink Web UI when service level 
> authorization is enabled
> 
>
> Key: FLINK-4637
> URL: https://issues.apache.org/jira/browse/FLINK-4637
> Project: Flink
>  Issue Type: Task
>  Components: Security
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>Priority: Major
>
> When service level authorization is enabled (FLINK-3930), the tracking URL 
> (Yarn RM Proxy) is not forwarding the secure cookie and as a result, the 
> Flink Web UI cannot be accessed through the proxy layer. Current workaround 
> is to use the direct Flink Web URL instead of navigating through proxy. This 
> JIRA should address the Yarn proxy/secure cookie navigation issue. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-4919) Add secure cookie support for the cluster deployed in Mesos environment

2018-07-30 Thread Eron Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eron Wright  updated FLINK-4919:

Issue Type: Sub-task  (was: Task)
Parent: FLINK-3930

> Add secure cookie support for the cluster deployed in Mesos environment
> ---
>
> Key: FLINK-4919
> URL: https://issues.apache.org/jira/browse/FLINK-4919
> Project: Flink
>  Issue Type: Sub-task
>  Components: Mesos, Security
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-4635) Implement Data Transfer Authentication using shared secret configuration

2018-07-30 Thread Eron Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-4635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eron Wright  closed FLINK-4635.
---
Resolution: Won't Fix

_This issue relates to the obsolete plan to use a 'shared secret' for client 
authentication.  Instead, SSL mutual authentication was implemented._

> Implement Data Transfer Authentication using shared secret configuration
> 
>
> Key: FLINK-4635
> URL: https://issues.apache.org/jira/browse/FLINK-4635
> Project: Flink
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Vijay Srinivasaraghavan
>Priority: Major
>
> The data transfer authentication (TM/Netty) requirement was not addressed as 
> part of FLINK-3930 and this JIRA is created to track the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-4635) Implement Data Transfer Authentication using shared secret configuration

2018-07-30 Thread Eron Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-4635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eron Wright  reassigned FLINK-4635:
---

Assignee: (was: Vijay Srinivasaraghavan)

> Implement Data Transfer Authentication using shared secret configuration
> 
>
> Key: FLINK-4635
> URL: https://issues.apache.org/jira/browse/FLINK-4635
> Project: Flink
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Vijay Srinivasaraghavan
>Priority: Major
>
> The data transfer authentication (TM/Netty) requirement was not addressed as 
> part of FLINK-3930 and this JIRA is created to track the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9998) FlinkKafkaConsumer produces lag -Inf when the pipeline lags

2018-07-30 Thread Julio Biason (JIRA)
Julio Biason created FLINK-9998:
---

 Summary: FlinkKafkaConsumer produces lag -Inf when the pipeline 
lags
 Key: FLINK-9998
 URL: https://issues.apache.org/jira/browse/FLINK-9998
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector
Affects Versions: 1.4.0
Reporter: Julio Biason


I reported this on the list, but now I have enough information to understand 
what's going on.

Sometimes, the KafkaConsumer will report a lag 
(flink_taskmanager_job_task_operator_KafkaConsumer_records_lag_max) of -Inf.

The problem seems to be related to the capture time.

If the pipeline (defined with EXACTLY_ONCE) starts lagging at some point, there 
won't be enough information in a certain period and the reported lag becomes 
-Inf.

Example: We had an external SQL sink, pointing to an RDS source, but with a 
cluster outside AWS. This produced a flush time of about 2 minutes for 500 
records (captured because we added a metric around `upload.executeBatch()` 
inside JDBCOutputFormat); although absurd (which is another problem), during 
this time the metric would report `-Inf` and return a proper value once 
the stream finished.

So it seems the lag, instead of being a captured value kept in memory and 
updated from time to time, is calculated from time to time (just because no 
record was processed in a certain period doesn't mean the lag went down).
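
For reference, the kind of timing instrumentation the report mentions around the JDBC flush, as a hypothetical sketch; the histogram and its registration are illustrative, not the actual JDBCOutputFormat code:

{code:java}
// Hypothetical timing wrapper around the batch flush inside JDBCOutputFormat.
long startNanos = System.nanoTime();
upload.executeBatch(); // the JDBC flush being measured
long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
flushTimeHistogram.update(elapsedMillis); // e.g. a Histogram registered on the operator's metric group
{code}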



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-7243) Add ParquetInputFormat

2018-07-30 Thread Adam Najman (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562125#comment-16562125
 ] 

Adam Najman edited comment on FLINK-7243 at 7/30/18 4:46 PM:
-

Is there an open PR for this I can look at? I'd like to start using Parquet as 
a source for some Flink jobs. 


was (Author: najman):
Is there an open PR for this I can look at? I'd like to start using Parque as a 
source for some Flink jobs. 

> Add ParquetInputFormat
> --
>
> Key: FLINK-7243
> URL: https://issues.apache.org/jira/browse/FLINK-7243
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: godfrey he
>Assignee: Zhenqiu Huang
>Priority: Major
>
> Add a {{ParquetInputFormat}} to read data from a Apache Parquet file. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7243) Add ParquetInputFormat

2018-07-30 Thread Adam Najman (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562125#comment-16562125
 ] 

Adam Najman commented on FLINK-7243:


Is there an open PR for this I can look at? I'd like to start using Parque as a 
source for some Flink jobs. 

> Add ParquetInputFormat
> --
>
> Key: FLINK-7243
> URL: https://issues.apache.org/jira/browse/FLINK-7243
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: godfrey he
>Assignee: Zhenqiu Huang
>Priority: Major
>
> Add a {{ParquetInputFormat}} to read data from a Apache Parquet file. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9874) set_conf_ssl in E2E tests fails on macOS

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562076#comment-16562076
 ] 

ASF GitHub Bot commented on FLINK-9874:
---

florianschmidt1994 edited a comment on issue #6453: [FLINK-9874][E2E Tests] Fix 
set_ssl_conf for macOS
URL: https://github.com/apache/flink/pull/6453#issuecomment-408921235
 
 
   According to [1], `ifconfig` is not part of coreutils, but rather "a 
wrapper to the IPConfiguration agent" [2]. I don't know if there is any other version of 
ifconfig out there that could be installed on macOS, but if there were, it could 
possibly break the code. I'd say the risk of that happening is rather slim, 
though.
   
   [1] https://en.wikipedia.org/wiki/List_of_GNU_Core_Utilities_commands
   [2] https://en.wikipedia.org/wiki/Ifconfig


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> set_conf_ssl in E2E tests fails on macOS
> 
>
> Key: FLINK-9874
> URL: https://issues.apache.org/jira/browse/FLINK-9874
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Florian Schmidt
>Assignee: Florian Schmidt
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.6.0
>
>
> Setting up a cluster with SSL support in the end-to-end tests with 
> `_set_conf_ssl_` will fail under macOS because in the command
> {code:java}
> hostname -I{code}
> is used, but '_-I'_ is not a supported parameter for the hostname command 
> under macOS



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9874) set_conf_ssl in E2E tests fails on macOS

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562075#comment-16562075
 ] 

ASF GitHub Bot commented on FLINK-9874:
---

florianschmidt1994 commented on issue #6453: [FLINK-9874][E2E Tests] Fix 
set_ssl_conf for macOS
URL: https://github.com/apache/flink/pull/6453#issuecomment-408921235
 
 
   According to [1], `ifconfig` is not part of coreutils, but rather "a 
wrapper to the IPConfiguration agent". I don't know if there is any other version of 
ifconfig out there that could be installed on macOS, but if there were, it could 
possibly break the code. I'd say the risk of that happening is rather slim, 
though.
   
   [1] https://en.wikipedia.org/wiki/List_of_GNU_Core_Utilities_commands
   [2] https://en.wikipedia.org/wiki/Ifconfig


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> set_conf_ssl in E2E tests fails on macOS
> 
>
> Key: FLINK-9874
> URL: https://issues.apache.org/jira/browse/FLINK-9874
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Florian Schmidt
>Assignee: Florian Schmidt
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.6.0
>
>
> Setting up a cluster with SSL support in the end-to-end tests with 
> `_set_conf_ssl_` will fail under macOS because in the command
> {code:java}
> hostname -I{code}
> is used, but '_-I'_ is not a supported parameter for the hostname command 
> under macOS



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9874) set_conf_ssl in E2E tests fails on macOS

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562067#comment-16562067
 ] 

ASF GitHub Bot commented on FLINK-9874:
---

NicoK commented on issue #6453: [FLINK-9874][E2E Tests] Fix set_ssl_conf for 
macOS
URL: https://github.com/apache/flink/pull/6453#issuecomment-408918418
 
 
   Just out of curiosity (and I don't know the details there):
   Does this still work with the GNU versions of these tools installed, e.g. 
via `homebrew`? Because at the moment, this depends entirely on the OS 
detection, not on the actual tools.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> set_conf_ssl in E2E tests fails on macOS
> 
>
> Key: FLINK-9874
> URL: https://issues.apache.org/jira/browse/FLINK-9874
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Florian Schmidt
>Assignee: Florian Schmidt
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.6.0
>
>
> Setting up a cluster with SSL support in the end-to-end tests with 
> `_set_conf_ssl_` will fail under macOS because in the command
> {code:java}
> hostname -I{code}
> is used, but '_-I'_ is not a supported parameter for the hostname command 
> under macOS



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9985) Incorrect parameter order in document

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562043#comment-16562043
 ] 

ASF GitHub Bot commented on FLINK-9985:
---

zhangminglei commented on issue #6457: [FLINK-9985] Incorrect parameter order 
in document
URL: https://github.com/apache/flink/pull/6457#issuecomment-408911173
 
 
   Thanks for your contribution! Change looks good to me.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Incorrect parameter order in document
> -
>
> Key: FLINK-9985
> URL: https://issues.apache.org/jira/browse/FLINK-9985
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.5.1
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/windows.html#incremental-window-aggregation-with-foldfunction
> {code:java}
> public Tuple3<String, Long, Integer> fold(Tuple3<String, Long, Integer> acc, SensorReading s) {
>   Integer cur = acc.getField(2);
>   acc.setField(2, cur + 1); // incorrect parameter order, it should be acc.setField(cur + 1, 2)
>   return acc;
> }
> {code}
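
For clarity, the corrected call, given Flink's Tuple.setField(value, position) signature (value first, index second); the tuple type follows the windows documentation example:

{code:java}
public Tuple3<String, Long, Integer> fold(Tuple3<String, Long, Integer> acc, SensorReading s) {
  Integer cur = acc.getField(2);
  acc.setField(cur + 1, 2); // value first, position second
  return acc;
}
{code}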



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9995) Jepsen: Clean up Mesos Logs and Working Directory

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562029#comment-16562029
 ] 

ASF GitHub Bot commented on FLINK-9995:
---

GJL opened a new pull request #6458: [FLINK-9995][tests] Improve tearing down 
Mesos.
URL: https://github.com/apache/flink/pull/6458
 
 
   ## What is the purpose of the change
   
   *Clean up Mesos logs and working directory so that we do not run out of disk 
space on the DB nodes while running tests.*
   
   ## Brief change log
 - Clean up logs and Mesos working directory.
 - Kill Mesos processes using the grepkill! utility.
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
 - *Manually ran Jepsen tests on AWS.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Jepsen: Clean up Mesos Logs and Working Directory
> -
>
> Key: FLINK-9995
> URL: https://issues.apache.org/jira/browse/FLINK-9995
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Major
>  Labels: pull-request-available
>
> When tearing down Mesos, all files in the log and working directory should be 
> cleaned up, or we risk running out of disk space after running enough tests 
> in succession.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9995) Jepsen: Clean up Mesos Logs and Working Directory

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9995:
--
Labels: pull-request-available  (was: )

> Jepsen: Clean up Mesos Logs and Working Directory
> -
>
> Key: FLINK-9995
> URL: https://issues.apache.org/jira/browse/FLINK-9995
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Major
>  Labels: pull-request-available
>
> When tearing down Mesos, all files in the log and working directory should be 
> cleaned up, or we risk running out of disk space after running enough tests 
> in succession.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9985) Incorrect parameter order in document

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562019#comment-16562019
 ] 

ASF GitHub Bot commented on FLINK-9985:
---

lsyldliu opened a new pull request #6457: [FLINK-9985] Incorrect parameter 
order in document
URL: https://github.com/apache/flink/pull/6457
 
 
   ## What is the purpose of the change
   
   *(This pull request aims to fix an incorrect parameter order in the 
documentation about windows)*
   
   ## Brief change log
   
   *(fix the incorrect parameter order in the documentation about windows)*
   
   ## Verifying this change
   
   *(This change is a trivial rework / code cleanup without any test coverage.)*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: ( no)
 - The runtime per-record code paths (performance sensitive): ( no )
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (docs)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Incorrect parameter order in document
> -
>
> Key: FLINK-9985
> URL: https://issues.apache.org/jira/browse/FLINK-9985
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.5.1
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/windows.html#incremental-window-aggregation-with-foldfunction
> {code:java}
> public Tuple3<String, Long, Integer> fold(Tuple3<String, Long, Integer> acc, SensorReading s) {
>   Integer cur = acc.getField(2);
>   acc.setField(2, cur + 1); // incorrect parameter order, it should be acc.setField(cur + 1, 2)
>   return acc;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9947) Document unified table sources/sinks/formats

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562009#comment-16562009
 ] 

ASF GitHub Bot commented on FLINK-9947:
---

twalthr commented on a change in pull request #6456: [FLINK-9947] [docs] 
Document unified table sources/sinks/formats
URL: https://github.com/apache/flink/pull/6456#discussion_r206185390
 
 

 ##
 File path: docs/dev/table/sourceSinks.md
 ##
 @@ -22,751 +22,17 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-A `TableSource` provides access to data which is stored in external systems 
(database, key-value store, message queue) or files. After a [TableSource is 
registered in a TableEnvironment](common.html#register-a-tablesource) it can 
accessed by [Table API](tableApi.html) or [SQL](sql.html) queries.
+A `TableSource` provides access to data which is stored in external systems 
(database, key-value store, message queue) or files. After a [TableSource is 
registered in a TableEnvironment](common.html#register-a-tablesource) it can be 
accessed by [Table API](tableApi.html) or [SQL](sql.html) queries.
 
-A TableSink [emits a Table](common.html#emit-a-table) to an external storage 
system, such as a database, key-value store, message queue, or file system (in 
different encodings, e.g., CSV, Parquet, or ORC). 
+A `TableSink` [emits a Table](common.html#emit-a-table) to an external storage 
system, such as a database, key-value store, message queue, or file system (in 
different encodings, e.g., CSV, Parquet, or ORC).
 
-Have a look at the [common concepts and API](common.html) page for details how 
to [register a TableSource](common.html#register-a-tablesource) and how to 
[emit a Table through a TableSink](common.html#emit-a-table).
+A `TableFactory` allows for separating the declaration of a connection to an 
external system from the actual implementation. A table factory creates 
configured instances of table sources and sinks from normalized, string-based 
properties. The properties can be generated programmatically using a 
`Descriptor` or via YAML configuration files for the [SQL 
Client](sqlClient.html).
+
+Have a look at the [common concepts and API](common.html) page for details how 
to [register a TableSource](common.html#register-a-tablesource) and how to 
[emit a Table through a TableSink](common.html#emit-a-table). See the [built-in 
sources, sinks, and formats](connect.md) page for examples how to use factories.
 
 Review comment:
   Will correct this `.md` when addressing the comments.
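
   As a rough illustration of the descriptor-based declaration described above, a sketch against the 1.6-era Table API; the exact connector and format descriptor names may differ:

{code:java}
// Sketch: declaring a table source from string-based properties via descriptors.
tableEnv
  .connect(new Kafka().version("0.11").topic("test-topic"))
  .withFormat(new Json().deriveSchema())
  .withSchema(new Schema()
    .field("user", Types.STRING)
    .field("cnt", Types.LONG))
  .inAppendMode()
  .registerTableSource("MyTable");
{code}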


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Document unified table sources/sinks/formats
> 
>
> Key: FLINK-9947
> URL: https://issues.apache.org/jira/browse/FLINK-9947
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>
> The recent unification of table sources/sinks/formats needs documentation. I 
> propose a new page that explains the built-in sources, sinks, and formats as 
> well as a page for customization of public interfaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9947) Document unified table sources/sinks/formats

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9947:
--
Labels: pull-request-available  (was: )

> Document unified table sources/sinks/formats
> 
>
> Key: FLINK-9947
> URL: https://issues.apache.org/jira/browse/FLINK-9947
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>
> The recent unification of table sources/sinks/formats needs documentation. I 
> propose a new page that explains the built-in sources, sinks, and formats as 
> well as a page for customization of public interfaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9947) Document unified table sources/sinks/formats

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562004#comment-16562004
 ] 

ASF GitHub Bot commented on FLINK-9947:
---

twalthr opened a new pull request #6456: [FLINK-9947] [docs] Document unified 
table sources/sinks/formats
URL: https://github.com/apache/flink/pull/6456
 
 
   ## What is the purpose of the change
   
   Documentation for unified table sources, sinks, and formats both for Table 
API & SQL and SQL Client.
   
   
   ## Brief change log
   
   - New `connect` page
   - Adapted `SQL Client` page
   - Adapted `Sources & Sinks` page
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? docs
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Document unified table sources/sinks/formats
> 
>
> Key: FLINK-9947
> URL: https://issues.apache.org/jira/browse/FLINK-9947
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>
> The recent unification of table sources/sinks/formats needs documentation. I 
> propose a new page that explains the built-in sources, sinks, and formats as 
> well as a page for customization of public interfaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9997) Improve Expression Reduce

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9997:
--
Labels: pull-request-available  (was: )

> Improve Expression Reduce
> -
>
> Key: FLINK-9997
> URL: https://issues.apache.org/jira/browse/FLINK-9997
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>Priority: Major
>  Labels: pull-request-available
>
> ExpressionReduce does not reduce some expressions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9997) Improve Expression Reduce

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561997#comment-16561997
 ] 

ASF GitHub Bot commented on FLINK-9997:
---

Xpray opened a new pull request #6455: [FLINK-9997][TableAPI & SQL] Improve 
Expression Reduce
URL: https://github.com/apache/flink/pull/6455
 
 
   ## What is the purpose of the change
   Improve Expression Reduce
   
   ## Brief change log
   Convert Project and Filter to Calc, let ReduceExpression.Calc reduce the 
Calc, and then remove the Calc if necessary.
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
org.apache.flink.table.plan.ExpressionReductionRulesTest.testReduceDeterministicUDF
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector:  no
   
   ## Documentation
   
 - Does this pull request introduce a new feature?  no
 - If yes, how is the feature documented?  no
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve Expression Reduce
> -
>
> Key: FLINK-9997
> URL: https://issues.apache.org/jira/browse/FLINK-9997
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>Priority: Major
>  Labels: pull-request-available
>
> ExpressionReduce does not reduce some expressions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Xpray opened a new pull request #6455: [FLINK-9997][TableAPI & SQL] Improve Expression Reduce

2018-07-30 Thread GitBox
Xpray opened a new pull request #6455: [FLINK-9997][TableAPI & SQL] Improve 
Expression Reduce
URL: https://github.com/apache/flink/pull/6455
 
 
   ## What is the purpose of the change
   Improve Expression Reduce
   
   ## Brief change log
   Convert Project and Filter to Calc and let ReduceExpression.Calc reduce the 
Calc, then remove the Calc if necessary.
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
org.apache.flink.table.plan.ExpressionReductionRulesTest.testReduceDeterministicUDF
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector:  no
   
   ## Documentation
   
 - Does this pull request introduce a new feature?  no
 - If yes, how is the feature documented?  no
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-9997) Improve Expression Reduce

2018-07-30 Thread Ruidong Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruidong Li updated FLINK-9997:
--
Summary: Improve Expression Reduce  (was: Improve ExpressionReduce)

> Improve Expression Reduce
> -
>
> Key: FLINK-9997
> URL: https://issues.apache.org/jira/browse/FLINK-9997
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>Priority: Major
>
> ExpressionReduce does not reduce some expressions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561897#comment-16561897
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408855934
 
 
   @zentol Could you assign this bug to me?
   There are a few more changes that could be done, such as:
   1. enhancing the throw statement with an appropriate error message that 
includes the file path, so the error becomes evident.
   I would suggest not removing File.pathSeparator, as that would take away 
functionality from users. Moreover, File.pathSeparator is an accepted way of 
providing multiple file paths. Since the key state.backend.rocksdb.localdir 
expects only local file paths, what is the harm in checking for "file"? A 
sketch of such a check follows below.
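   A minimal sketch of such a check (a hypothetical helper, not Flink's actual 
code), assuming a single configured path that is accepted only if it is 
schemeless or uses the explicit "file" scheme:

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

public class LocalDirCheck {

    // Hypothetical helper: accept a configured path only if it is local,
    // i.e. has no scheme or the "file" scheme, and include the offending
    // path in the error message otherwise.
    static File toLocalFile(String configuredPath) {
        try {
            URI uri = new URI(configuredPath);
            String scheme = uri.getScheme();
            if (scheme != null && !"file".equalsIgnoreCase(scheme)) {
                throw new IllegalArgumentException(
                        "Only local paths are supported for state.backend.rocksdb.localdir, got: "
                                + configuredPath);
            }
            // Strip the scheme so java.io.File sees a plain path again.
            return scheme == null ? new File(configuredPath) : new File(uri.getPath());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid path: " + configuredPath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toLocalFile("file:///home/abc/share")); // /home/abc/share
        System.out.println(toLocalFile("/tmp/rocksdb"));           // /tmp/rocksdb
    }
}
```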


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  reported that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception that claimed the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use Java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported filesystems for RocksDB

2018-07-30 Thread GitBox
sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408855934
 
 
   @zentol Could you assign this bug to me?
   There are a few more changes that could be done, such as:
   1. enhancing the throw statement with an appropriate error message that 
includes the file path, so the error becomes evident.
   I would suggest not removing File.pathSeparator, as that would take away 
functionality from users. Moreover, File.pathSeparator is an accepted way of 
providing multiple file paths. Since the key state.backend.rocksdb.localdir 
expects only local file paths, what is the harm in checking for "file"?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561884#comment-16561884
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

zentol commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408851714
 
 
   Note that this of course could break setups where people actually use `:` as 
a separator :/


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  reported that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception that claimed the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use Java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561885#comment-16561885
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

zentol edited a comment on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408851714
 
 
   Note that this of course could break setups where people actually use `:` as 
a separator :/


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  reported that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception that claimed the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use Java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zentol edited a comment on issue #6452: [FLINK-9739] Regression in supported filesystems for RocksDB

2018-07-30 Thread GitBox
zentol edited a comment on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408851714
 
 
   Note that this of course could break setups where people actually use `:` as 
a separator :/


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zentol commented on issue #6452: [FLINK-9739] Regression in supported filesystems for RocksDB

2018-07-30 Thread GitBox
zentol commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408851714
 
 
   Note that this of course could break setups where people actually use `:` as 
a separator :/


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zentol commented on issue #6452: [FLINK-9739] Regression in supported filesystems for RocksDB

2018-07-30 Thread GitBox
zentol commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408851475
 
 
   Yes, that would be my suggestion.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561883#comment-16561883
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

zentol commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408851475
 
 
   Yes, that would be my suggestion.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  reported that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception that claimed the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use Java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561875#comment-16561875
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408850327
 
 
   @zentol So you are suggesting to remove File.pathSeparator from the split 
options?
   I agree with you that the solution looks trivial/simple, but after adding 
some print statements to the code and building flink-dist.jar, it became 
evident that the issue is in the parsing of the file path (see the sketch 
below).
   As discussed on the user mailing list, this "Relative Path not 
supported" issue has nothing to do with the underlying FS (such as GlusterFS), 
nor any dependence on K8s. The same issue can be reproduced even with a 
standalone Flink application on a plain *nix server.
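   To make the parsing problem concrete, here is a minimal, hypothetical demo 
(not Flink code): on Unix `File.pathSeparator` is `:`, so splitting a 
`file://` URI on it cuts the scheme off as its own, relative segment:

```java
import java.io.File;

public class SplitDemo {
    public static void main(String[] args) {
        String configured = "file:///some/file/path";
        // On Unix File.pathSeparator is ":", so the URI scheme is split off.
        for (String segment : configured.split(File.pathSeparator)) {
            System.out.println("'" + segment + "' absolute=" + new File(segment).isAbsolute());
        }
        // Prints (on Unix):
        // 'file' absolute=false            <- trips the File.isAbsolute() check
        // '///some/file/path' absolute=true
    }
}
```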


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  reported that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception that claimed the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use Java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported filesystems for RocksDB

2018-07-30 Thread GitBox
sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408850327
 
 
   @zentol So you are suggesting to remove File.pathSeparator from the split 
options?
   I agree with you that the solution looks trivial/simple, but after adding 
some print statements to the code and building flink-dist.jar, it became 
evident that the issue is in the parsing of the file path.
   As discussed on the user mailing list, this "Relative Path not 
supported" issue has nothing to do with the underlying FS (such as GlusterFS), 
nor any dependence on K8s. The same issue can be reproduced even with a 
standalone Flink application on a plain *nix server.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9996) PrestoS3FileSystemITCase#testConfigKeysForwarding fails on travis

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561869#comment-16561869
 ] 

ASF GitHub Bot commented on FLINK-9996:
---

zentol commented on issue #6454: [FLINK-9996][fs] Wrap exceptions during FS 
creation
URL: https://github.com/apache/flink/pull/6454#issuecomment-408849850
 
 
   Naturally the alternative is to update the `FileSystem` docs and update the 
linked test to catch all exceptions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> PrestoS3FileSystemITCase#testConfigKeysForwarding fails on travis
> -
>
> Key: FLINK-9996
> URL: https://issues.apache.org/jira/browse/FLINK-9996
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, Tests
>Affects Versions: 1.6.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> https://travis-ci.org/apache/flink/jobs/409590482
> {code}
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.838 sec <<< 
> FAILURE! - in org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase
> testConfigKeysForwarding(org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase)
>   Time elapsed: 1.613 sec  <<< ERROR!
> java.lang.RuntimeException: S3 credentials not configured
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.getAwsCredentialsProvider(PrestoS3FileSystem.java:702)
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.createAmazonS3Client(PrestoS3FileSystem.java:628)
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.initialize(PrestoS3FileSystem.java:212)
>   at 
> org.apache.flink.runtime.fs.hdfs.AbstractFileSystemFactory.create(AbstractFileSystemFactory.java:56)
>   at 
> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:395)
>   at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
>   at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
>   at 
> org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase.testConfigKeysForwarding(PrestoS3FileSystemITCase.java:84)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zentol commented on issue #6454: [FLINK-9996][fs] Wrap exceptions during FS creation

2018-07-30 Thread GitBox
zentol commented on issue #6454: [FLINK-9996][fs] Wrap exceptions during FS 
creation
URL: https://github.com/apache/flink/pull/6454#issuecomment-408849850
 
 
   Naturally the alternative is to update the `FileSystem` docs and update the 
linked test to catch all exceptions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9996) PrestoS3FileSystemITCase#testConfigKeysForwarding fails on travis

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561867#comment-16561867
 ] 

ASF GitHub Bot commented on FLINK-9996:
---

zentol opened a new pull request #6454: [FLINK-9996][fs] Wrap exceptions during 
FS creation
URL: https://github.com/apache/flink/pull/6454
 
 
   ## What is the purpose of the change
   
   This PR is a small modification to the hadoop `AbstractFileSystemFactory` 
to wrap all exceptions that occur during the creation of a FileSystem in an 
`IOException`, like we did in the past.
   This behavior was modified in 7be07871c23b56547add4cd85e15b95c757f882b.
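   A minimal sketch of the wrapping described above (the factory and method 
names are placeholders, not the actual patch):

```java
import java.io.IOException;
import java.net.URI;

// Sketch only: restore the old contract that file system creation surfaces
// all failures as IOException, so callers such as
// FileSystem.getUnguardedFileSystem() do not have to handle unchecked
// exceptions like RuntimeException("S3 credentials not configured").
abstract class WrappingFileSystemFactory<FS> {

    abstract FS createInternal(URI fsUri) throws Exception;

    FS create(URI fsUri) throws IOException {
        try {
            return createInternal(fsUri);
        } catch (IOException e) {
            throw e; // already the expected type, pass through unchanged
        } catch (Exception e) {
            throw new IOException("Could not create file system for " + fsUri, e);
        }
    }
}
```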
   
   ## Verifying this change
   
   * run `PrestoS3FileSystemITCase#testConfigKeysForwarding`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (yes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> PrestoS3FileSystemITCase#testConfigKeysForwarding fails on travis
> -
>
> Key: FLINK-9996
> URL: https://issues.apache.org/jira/browse/FLINK-9996
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, Tests
>Affects Versions: 1.6.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> https://travis-ci.org/apache/flink/jobs/409590482
> {code}
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.838 sec <<< 
> FAILURE! - in org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase
> testConfigKeysForwarding(org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase)
>   Time elapsed: 1.613 sec  <<< ERROR!
> java.lang.RuntimeException: S3 credentials not configured
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.getAwsCredentialsProvider(PrestoS3FileSystem.java:702)
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.createAmazonS3Client(PrestoS3FileSystem.java:628)
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.initialize(PrestoS3FileSystem.java:212)
>   at 
> org.apache.flink.runtime.fs.hdfs.AbstractFileSystemFactory.create(AbstractFileSystemFactory.java:56)
>   at 
> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:395)
>   at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
>   at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
>   at 
> org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase.testConfigKeysForwarding(PrestoS3FileSystemITCase.java:84)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9996) PrestoS3FileSystemITCase#testConfigKeysForwarding fails on travis

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9996:
--
Labels: pull-request-available  (was: )

> PrestoS3FileSystemITCase#testConfigKeysForwarding fails on travis
> -
>
> Key: FLINK-9996
> URL: https://issues.apache.org/jira/browse/FLINK-9996
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, Tests
>Affects Versions: 1.6.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> https://travis-ci.org/apache/flink/jobs/409590482
> {code}
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.838 sec <<< 
> FAILURE! - in org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase
> testConfigKeysForwarding(org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase)
>   Time elapsed: 1.613 sec  <<< ERROR!
> java.lang.RuntimeException: S3 credentials not configured
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.getAwsCredentialsProvider(PrestoS3FileSystem.java:702)
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.createAmazonS3Client(PrestoS3FileSystem.java:628)
>   at 
> com.facebook.presto.hive.PrestoS3FileSystem.initialize(PrestoS3FileSystem.java:212)
>   at 
> org.apache.flink.runtime.fs.hdfs.AbstractFileSystemFactory.create(AbstractFileSystemFactory.java:56)
>   at 
> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:395)
>   at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
>   at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
>   at 
> org.apache.flink.fs.s3presto.PrestoS3FileSystemITCase.testConfigKeysForwarding(PrestoS3FileSystemITCase.java:84)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zentol opened a new pull request #6454: [FLINK-9996][fs] Wrap exceptions during FS creation

2018-07-30 Thread GitBox
zentol opened a new pull request #6454: [FLINK-9996][fs] Wrap exceptions during 
FS creation
URL: https://github.com/apache/flink/pull/6454
 
 
   ## What is the purpose of the change
   
   This PR is a small modification to the hadoop `AbstractFileSystemFactory` 
to wrap all exceptions that occur during the creation of a FileSystem in an 
`IOException`, like we did in the past.
   This behavior was modified in 7be07871c23b56547add4cd85e15b95c757f882b.
   
   ## Verifying this change
   
   * run `PrestoS3FileSystemITCase#testConfigKeysForwarding`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (yes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-9997) Improve ExpressionReduce

2018-07-30 Thread Ruidong Li (JIRA)
Ruidong Li created FLINK-9997:
-

 Summary: Improve ExpressionReduce
 Key: FLINK-9997
 URL: https://issues.apache.org/jira/browse/FLINK-9997
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: Ruidong Li
Assignee: Ruidong Li


ExpressionReduce does not reduce some expressions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561856#comment-16561856
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

zentol edited a comment on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408846731
 
 
   I'm well aware that `:` was explicitly introduced to support multiple file 
paths, my point is that it (in hindsight) wasn't a good choice, as other 
delimiter options exist (`,|;`) that are less ambiguous.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  reported that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception that claimed the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use Java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561855#comment-16561855
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

zentol commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408846731
 
 
   I'm well aware that `:` was explicitly introduced to support multiple file 
paths, my point is that it (in hindsight) wasn't a good choice, as other 
delimiter options exist (`,|;`) that are less ambiguous.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  reported that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception that claimed the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use Java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zentol edited a comment on issue #6452: [FLINK-9739] Regression in supported filesystems for RocksDB

2018-07-30 Thread GitBox
zentol edited a comment on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408846731
 
 
   I'm well aware that `:` was explicitly introduced to support multiple file 
paths, my point is that it (in hindsight) wasn't a good choice, as other 
delimiter options exist (`,|;`) that are less ambiguous.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zentol commented on issue #6452: [FLINK-9739] Regression in supported filesystems for RocksDB

2018-07-30 Thread GitBox
zentol commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408846731
 
 
   I'm well aware that `:` was explicitly introduced to support multiple file 
paths, my point is that it (in hindsight) wasn't a good choice, as other 
delimiter options exist (`,|;`) that are less ambiguous.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9874) set_conf_ssl in E2E tests fails on macOS

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561845#comment-16561845
 ] 

ASF GitHub Bot commented on FLINK-9874:
---

florianschmidt1994 commented on issue #6453: [FLINK-9874][E2E Tests] Fix 
set_ssl_conf for macOS
URL: https://github.com/apache/flink/pull/6453#issuecomment-408844189
 
 
   @NicoK here we go 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> set_conf_ssl in E2E tests fails on macOS
> 
>
> Key: FLINK-9874
> URL: https://issues.apache.org/jira/browse/FLINK-9874
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Florian Schmidt
>Assignee: Florian Schmidt
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.6.0
>
>
> Setting up a cluster with SSL support in the end-to-end tests with 
> `_set_conf_ssl_` will fail under macOS because the command
> {code:java}
> hostname -I{code}
> is used, but _-I_ is not a supported parameter of the hostname command 
> under macOS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9874) set_conf_ssl in E2E tests fails on macOS

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561844#comment-16561844
 ] 

ASF GitHub Bot commented on FLINK-9874:
---

florianschmidt1994 opened a new pull request #6453: [FLINK-9874][E2E Tests] Fix 
set_ssl_conf for macOS
URL: https://github.com/apache/flink/pull/6453
 
 
   ## What is the purpose of the change
   
   This fixes the set_ssl_conf utility function under macOS. The previous
   version was using `hostname -I` regardless of the OS, but the -I option
   is not available on the BSD version of the hostname command.
   
   This is now fixed by collecting all IPv4 addresses from ifconfig when the
   OS is macOS and formatting the output to match `hostname -I`.
   
   Additionally, the filtering of the output is removed so that all
   IP addresses are now appended to the SANSTRING instead of just one.
   
   ## Verifying this change
   - Successfully ran `./run-single-test.sh 
test-scripts/test_batch_allround.sh` (this uses SSL setup) on macOS 10.13.6 
   - Successfully ran `./run-single-test.sh 
test-scripts/test_batch_allround.sh` (this uses SSL setup) on Ubuntu 16.04.4
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> set_conf_ssl in E2E tests fails on macOS
> 
>
> Key: FLINK-9874
> URL: https://issues.apache.org/jira/browse/FLINK-9874
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Florian Schmidt
>Assignee: Florian Schmidt
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.6.0
>
>
> Setting up a cluster with SSL support in the end-to-end tests with 
> `_set_conf_ssl_` will fail under macOS because the command
> {code:java}
> hostname -I{code}
> is used, but _-I_ is not a supported parameter of the hostname command 
> under macOS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9874) set_conf_ssl in E2E tests fails on macOS

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9874:
--
Labels: pull-request-available test-stability  (was: test-stability)

> set_conf_ssl in E2E tests fails on macOS
> 
>
> Key: FLINK-9874
> URL: https://issues.apache.org/jira/browse/FLINK-9874
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.6.0
>Reporter: Florian Schmidt
>Assignee: Florian Schmidt
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.6.0
>
>
> Setting up a cluster with SSL support in the end-to-end tests with 
> `_set_conf_ssl_` will fail under macOS because the command
> {code:java}
> hostname -I{code}
> is used, but _-I_ is not a supported parameter of the hostname command 
> under macOS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] florianschmidt1994 commented on issue #6453: [FLINK-9874][E2E Tests] Fix set_ssl_conf for macOS

2018-07-30 Thread GitBox
florianschmidt1994 commented on issue #6453: [FLINK-9874][E2E Tests] Fix 
set_ssl_conf for macOS
URL: https://github.com/apache/flink/pull/6453#issuecomment-408844189
 
 
   @NicoK here we go 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] florianschmidt1994 opened a new pull request #6453: [FLINK-9874][E2E Tests] Fix set_ssl_conf for macOS

2018-07-30 Thread GitBox
florianschmidt1994 opened a new pull request #6453: [FLINK-9874][E2E Tests] Fix 
set_ssl_conf for macOS
URL: https://github.com/apache/flink/pull/6453
 
 
   ## What is the purpose of the change
   
   This fixes the set_ssl_conf utility function under macOS. The previous
   version was using `hostname -I` regardless of the OS, but the -I option
   is not available on the BSD version of the hostname command.
   
   This is now fixed by collecting all IPv4 addresses from ifconfig when the
   OS is macOS and formatting the output to match `hostname -I`.
   
   Additionally, the filtering of the output is removed so that all
   IP addresses are now appended to the SANSTRING instead of just one.
   
   ## Verifying this change
   - Successfully ran `./run-single-test.sh 
test-scripts/test_batch_allround.sh` (this uses SSL setup) on macOS 10.13.6 
   - Successfully ran `./run-single-test.sh 
test-scripts/test_batch_allround.sh` (this uses SSL setup) on Ubuntu 16.04.4
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9739) Regression in supported filesystems for RocksDB

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561825#comment-16561825
 ] 

ASF GitHub Bot commented on FLINK-9739:
---

sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408839099
 
 
   Hi @zentol. Splitting on `:` was added to support multiple file paths:
   on Windows File.pathSeparator is ";", whereas on Unix it is ":".
   The reference you pointed to, https://github.com/apache/flink/pull/2482, is 
correct, but in that code base the implementation of setDbStoragePaths() is 
different from the current one: the API changed from 
org.apache.flink.core.fs.Path to java.io.File, and an extra 
File.isAbsolute() check was introduced.
   Due to these changes in setDbStoragePaths() it is not possible to provide 
the configuration like this:
   state.backend.rocksdb.localdir = file:///some/file/path. If I'm not wrong, 
this is a valid way of providing the config.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Regression in supported filesystems for RocksDB
> ---
>
> Key: FLINK-9739
> URL: https://issues.apache.org/jira/browse/FLINK-9739
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> A user on the [mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-in-Flink-1-5-0-tt21173.html]
>  reported that the {{RocksDBStateBackend}} no longer supports GlusterFS 
> mounted volumes.
> Configuring {{file:///home/abc/share}} led to an exception that claimed the 
> path was not absolute.
> This was working fine in 1.4.2.
> In FLINK-6557 the {{RocksDBStateBackend}} was refactored to use Java 
> {{Files}} instead of Flink {{Paths}}, potentially causing the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported filesystems for RocksDB

2018-07-30 Thread GitBox
sampathBhat commented on issue #6452: [FLINK-9739] Regression in supported 
filesystems for RocksDB
URL: https://github.com/apache/flink/pull/6452#issuecomment-408839099
 
 
   Hi @zentol. Splitting on `:` was added to support multiple file paths:
   on Windows File.pathSeparator is ";", whereas on Unix it is ":".
   The reference you pointed to, https://github.com/apache/flink/pull/2482, is 
correct, but in that code base the implementation of setDbStoragePaths() is 
different from the current one: the API changed from 
org.apache.flink.core.fs.Path to java.io.File, and an extra 
File.isAbsolute() check was introduced.
   Due to these changes in setDbStoragePaths() it is not possible to provide 
the configuration like this:
   state.backend.rocksdb.localdir = file:///some/file/path. If I'm not wrong, 
this is a valid way of providing the config.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-9976) Odd signatures for streaming file sink format builders

2018-07-30 Thread Chesnay Schepler (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-9976:

Fix Version/s: 1.6.0

> Odd signatures for streaming file sink format builders
> --
>
> Key: FLINK-9976
> URL: https://issues.apache.org/jira/browse/FLINK-9976
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.6.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> There are 2 instances of apparently unnecessary generic parameters in the 
> format builders for the {{StreamingFileSink}}.
> Both these methods have a generic parameter for the BucketID type; however, 
> the builder itself already has such a parameter. The methods use unchecked 
> casts to make the types fit, so we should be able to modify the signature to 
> use the builder's parameter instead.
> {code}
> public static class RowFormatBuilder<IN, BucketID> extends 
> StreamingFileSink.BucketsBuilder<IN, BucketID> {
> ...
>   public <ID> StreamingFileSink.RowFormatBuilder<IN, ID> 
> withBucketerAndPolicy(final Bucketer<IN, ID> bucketer, final 
> RollingPolicy<IN, ID> policy) {
>   @SuppressWarnings("unchecked")
>   StreamingFileSink.RowFormatBuilder<IN, ID> reInterpreted = 
> (StreamingFileSink.RowFormatBuilder<IN, ID>) this;
>   reInterpreted.bucketer = Preconditions.checkNotNull(bucketer);
>   reInterpreted.rollingPolicy = 
> Preconditions.checkNotNull(policy);
>   return reInterpreted;
>   }
> ...
> {code}
> {code}
> public static class BulkFormatBuilder<IN, BucketID> extends 
> StreamingFileSink.BucketsBuilder<IN, BucketID> {
> ...
>   public <ID> StreamingFileSink.BulkFormatBuilder<IN, ID> 
> withBucketer(Bucketer<IN, ID> bucketer) {
>   @SuppressWarnings("unchecked")
>   StreamingFileSink.BulkFormatBuilder<IN, ID> reInterpreted = 
> (StreamingFileSink.BulkFormatBuilder<IN, ID>) this;
>   reInterpreted.bucketer = Preconditions.checkNotNull(bucketer);
>   return reInterpreted;
>   }
> ...
> {code}
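A sketch of what the suggested signature could look like when it reuses the 
builder's own BucketID parameter, so the unchecked cast disappears 
(illustrative only, not the final patch):

{code}
public StreamingFileSink.RowFormatBuilder<IN, BucketID> withBucketerAndPolicy(
		final Bucketer<IN, BucketID> bucketer,
		final RollingPolicy<IN, BucketID> policy) {
	// No method-level <ID> and no cast: the class-level BucketID is used directly.
	this.bucketer = Preconditions.checkNotNull(bucketer);
	this.rollingPolicy = Preconditions.checkNotNull(policy);
	return this;
}
{code}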



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9936) Mesos resource manager unable to connect to master after failover

2018-07-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561819#comment-16561819
 ] 

ASF GitHub Bot commented on FLINK-9936:
---

GJL commented on issue #6451: [FLINK-9936] Resource manager connect to mesos 
after leadership granted. .
URL: https://github.com/apache/flink/pull/6451#issuecomment-408838604
 
 
   I'll take a look today.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Mesos resource manager unable to connect to master after failover
> -
>
> Key: FLINK-9936
> URL: https://issues.apache.org/jira/browse/FLINK-9936
> Project: Flink
>  Issue Type: Bug
>  Components: Mesos, Scheduler
>Affects Versions: 1.5.0, 1.5.1, 1.6.0
>Reporter: Renjie Liu
>Assignee: Renjie Liu
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.5.2, 1.6.0
>
>
> When deployed in Mesos session cluster mode, the connection monitor keeps 
> reporting that it is unable to connect to Mesos after a restart. In fact, the 
> scheduler driver has already connected to the Mesos master, but the connected 
> message is lost: because leadership has not been granted yet and the fencing 
> id is not set, the RPC service ignores the connected message. So we should 
> connect to the Mesos master only after leadership is granted.
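A minimal sketch of the proposed ordering (class and method names are 
illustrative, loosely modeled on Flink's leader election callbacks, not the 
actual patch):

{code}
import java.util.UUID;

// Sketch only: connect to the Mesos master after leadership is granted,
// so the fencing token is already set and the "connected" message is not
// dropped by the fenced RPC endpoint.
class MesosResourceManagerSketch {

	private volatile UUID fencingToken;

	// Invoked by the leader election service once leadership is granted.
	void grantLeadership(UUID leaderSessionId) {
		this.fencingToken = leaderSessionId; // set the fencing token first...
		connectToMesosMaster();              // ...then start the scheduler driver
	}

	private void connectToMesosMaster() {
		System.out.println("Connecting to Mesos with fencing token " + fencingToken);
	}
}
{code}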



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

