[jira] [Commented] (FLINK-7737) On HCFS systems, FSDataOutputStream does not issue hsync only hflush which leads to data loss

2017-10-20 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213008#comment-16213008
 ] 

Vijay Srinivasaraghavan commented on FLINK-7737:


[~pnowojski] Thanks for the fix. It addresses the FS sink operation when any of 
the writer implementations is used (AvroKVSinkWriter, SequenceFileWriter, 
StringWriter). However, the BucketingSink calls flush() while a snapshot is 
being taken (see snapshotState()), which causes sync() to happen frequently. 
Essentially we need to call sync() only while closing the current part 
file. Instead of calling sync() during flush(), having a separate API call to 
handle sync() might help.
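The proposed split can be sketched as follows. All names here (PartFileWriter, the event list) are invented for illustration and are not the actual Flink BucketingSink/StreamWriterBase API: flush() stays cheap and runs on every checkpoint, while the expensive sync() is deferred to part-file close.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of separating flush() from sync().
public class SyncOnCloseSketch {

    static class PartFileWriter {
        final List<String> events = new ArrayList<>();

        void flush() { events.add("hflush"); }   // cheap: data to DataNode buffers
        void sync()  { events.add("hsync");  }   // expensive: force to disk

        void close() {
            flush();
            sync();                              // sync exactly once, on close
        }
    }

    public static void main(String[] args) {
        PartFileWriter w = new PartFileWriter();
        // snapshotState() runs per checkpoint: flush only, no sync
        w.flush();
        w.flush();
        w.close();
        System.out.println(w.events);            // [hflush, hflush, hflush, hsync]
    }
}
```

With this shape, frequent checkpoints no longer translate into frequent disk syncs.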

[~ryanehobbs] create() is not modified since the SYNC_BLOCK flag is not used. This 
fix addresses the writer implementations directly, and hence upon writer close, 
sync() will be invoked when the flag is set (as part of the flush() API call).

> On HCFS systems, FSDataOutputStream does not issue hsync only hflush which 
> leads to data loss
> -
>
> Key: FLINK-7737
> URL: https://issues.apache.org/jira/browse/FLINK-7737
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.3.2
> Environment: Dev
>Reporter: Ryan Hobbs
>Assignee: Piotr Nowojski
>Priority: Blocker
> Fix For: 1.4.0
>
>
> During several tests where we simulated failure conditions, we have observed 
> that on HCFS systems where the data stream is of type FSDataOutputStream, 
> Flink will issue hflush() and not hsync() which results in data loss.
> In the class *StreamWriterBase.java* the code below will execute hsync if the 
> output stream is of type *HdfsDataOutputStream* but not for streams of type 
> *FSDataOutputStream*.  Is this by design?
> {code}
> protected void hflushOrSync(FSDataOutputStream os) throws IOException {
>     try {
>         // At this point the refHflushOrSync cannot be null,
>         // since register method would have thrown if it was.
>         this.refHflushOrSync.invoke(os);
>         if (os instanceof HdfsDataOutputStream) {
>             ((HdfsDataOutputStream) os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
>         }
>     } catch (InvocationTargetException e) {
>         String msg = "Error while trying to hflushOrSync!";
>         LOG.error(msg + " " + e.getCause());
>         Throwable cause = e.getCause();
>         if (cause != null && cause instanceof IOException) {
>             throw (IOException) cause;
>         }
>         throw new RuntimeException(msg, e);
>     } catch (Exception e) {
>         String msg = "Error while trying to hflushOrSync!";
>         LOG.error(msg + " " + e);
>         throw new RuntimeException(msg, e);
>     }
> }
> {code}
> Could a potential fix be to perform a sync even on streams of type 
> *FSDataOutputStream*?
> {code}
> if (os instanceof HdfsDataOutputStream) {
>     ((HdfsDataOutputStream) os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
> } else if (os instanceof FSDataOutputStream) {
>     os.hsync();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7737) On HCFS systems, FSDataOutputStream does not issue hsync only hflush which leads to data loss

2017-10-17 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208634#comment-16208634
 ] 

Vijay Srinivasaraghavan commented on FLINK-7737:


I am trying to understand the changes that went in as part of the PR 
(https://github.com/apache/flink/pull/4781). I see a few FileSystemFactory 
implementations (HDFS, MapR, S3Presto, S3Hadoop, LocalFS) that handle the 
concrete FS invocation (plus configuration/scheme). 

There is no hsync() API call during open(); instead, we call only hflush(), with 
the assumption that data will be appropriately synced to the disk by the 
underlying implementation. Is my understanding right? 

If so, I am under the assumption that for stock HDFS, if the file is created 
with the SYNC_BLOCK flag option, then the blocks will be synced to the disk 
upon close(); otherwise we need to invoke hsync() explicitly. With the current 
changes, if the TMs were to fail and recover, will the data on the Hadoop DNs 
get synced to the disk?
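The SYNC_BLOCK behaviour described above can be modelled with a toy sketch (invented types, not the HDFS client): with the flag set at create() time, close() implies a disk sync; without it, an explicit hsync() is required.

```java
import java.util.EnumSet;

// Toy model of SYNC_BLOCK semantics; FileModel and this CreateFlag enum are
// stand-ins for illustration only, not Hadoop's classes.
public class SyncBlockSketch {
    enum CreateFlag { SYNC_BLOCK }

    static class FileModel {
        final boolean syncBlock;
        boolean onDisk = false;

        FileModel(EnumSet<CreateFlag> flags) {
            syncBlock = flags.contains(CreateFlag.SYNC_BLOCK);
        }

        void hsync() { onDisk = true; }               // explicit sync always works
        void close() { if (syncBlock) onDisk = true; } // implicit sync needs the flag
    }

    public static void main(String[] args) {
        FileModel plain = new FileModel(EnumSet.noneOf(CreateFlag.class));
        plain.close();
        System.out.println(plain.onDisk);   // false: no SYNC_BLOCK, no explicit hsync

        FileModel synced = new FileModel(EnumSet.of(CreateFlag.SYNC_BLOCK));
        synced.close();
        System.out.println(synced.onDisk);  // true: close() synced the blocks
    }
}
```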



[jira] [Commented] (FLINK-7737) On HCFS systems, FSDataOutputStream does not issue hsync only hflush which leads to data loss

2017-10-10 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198898#comment-16198898
 ] 

Vijay Srinivasaraghavan commented on FLINK-7737:


I believe hflush() routes the data to the DN, but the data is lost since no sync 
to the disk happens (I will let Ryan confirm). 

I think we cannot generalize the hsync() call since the `SyncFlag` is NameNode 
specific - 
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java#L599
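The distinction can be sketched with stand-in types (not the real Hadoop classes): the SyncFlag-taking hsync() overload exists only on the HDFS-specific stream, so a generic code path could at most fall back to the flag-less hsync().

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in stubs illustrating why the SyncFlag variant cannot be generalized.
public class SyncFlagSketch {
    static class FSDataOutputStreamStub {
        final List<String> calls = new ArrayList<>();
        void hsync() { calls.add("hsync()"); }
    }

    static class HdfsDataOutputStreamStub extends FSDataOutputStreamStub {
        void hsync(String syncFlag) { calls.add("hsync(" + syncFlag + ")"); } // HDFS-only overload
    }

    static void sync(FSDataOutputStreamStub os) {
        if (os instanceof HdfsDataOutputStreamStub) {
            ((HdfsDataOutputStreamStub) os).hsync("UPDATE_LENGTH"); // flagged sync
        } else {
            os.hsync();                                             // generic fallback
        }
    }

    public static void main(String[] args) {
        FSDataOutputStreamStub generic = new FSDataOutputStreamStub();
        sync(generic);
        System.out.println(generic.calls);  // [hsync()]
    }
}
```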



[jira] [Commented] (FLINK-7737) On HCFS systems, FSDataOutputStream does not issue hsync only hflush which leads to data loss

2017-10-02 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189049#comment-16189049
 ] 

Vijay Srinivasaraghavan commented on FLINK-7737:


Some observations with respect to the usage of hflush() vs hsync(). When using an 
HCFS implementation as the backing filesystem, only hflush() is invoked, since 
the call to hsync() happens only when the FSDataOutputStream is an instance of 
HdfsDataOutputStream. Due to this, we are seeing some data loss when the 
bucketing sink is holding data in the pending state and trying to close the 
stream (as part of TM failover recovery).

I do not see any issue in adding another condition to include an hsync() call 
for HCFS types (FSDataOutputStream). 

[~rmetzger] Could you please take a look?

hflush() - This API flushes all outstanding data (i.e. the current unfinished 
packet) from the client into the OS buffers on all DataNode replicas.

hsync() - This API flushes the data to the DataNodes, like hflush(), but should 
also force the data to underlying physical storage via fsync (or equivalent).

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
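The difference between the two calls quoted above can be shown with a toy durability model (all types invented for illustration): hflush() moves client-buffered data to the DataNode side, while hsync() additionally forces it to durable storage.

```java
// Toy model: osBuf stands for DataNode OS buffers (lost on power failure),
// disk stands for durable storage. Not the actual DFSOutputStream.
public class FlushVsSyncModel {
    static class StreamModel {
        StringBuilder clientBuf = new StringBuilder();
        StringBuilder osBuf = new StringBuilder();   // visible to readers, not durable
        StringBuilder disk = new StringBuilder();    // survives power failure

        void write(String s) { clientBuf.append(s); }

        void hflush() {                              // flush current packet to DN buffers
            osBuf.append(clientBuf);
            clientBuf.setLength(0);
        }

        void hsync() {                               // hflush + force to disk
            hflush();
            disk.append(osBuf);
            osBuf.setLength(0);
        }
    }

    public static void main(String[] args) {
        StreamModel s = new StreamModel();
        s.write("a");
        s.hflush();                   // "a" reaches OS buffers only
        s.write("b");
        s.hsync();                    // everything forced to disk
        System.out.println(s.disk);   // ab
    }
}
```

In this model a crash after hflush() alone loses the buffered bytes, which mirrors the data loss seen with the bucketing sink.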



[jira] [Assigned] (FLINK-5974) Support Mesos DNS

2017-03-06 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan reassigned FLINK-5974:
--

Assignee: Vijay Srinivasaraghavan  (was: Eron Wright )

> Support Mesos DNS
> -
>
> Key: FLINK-5974
> URL: https://issues.apache.org/jira/browse/FLINK-5974
> Project: Flink
>  Issue Type: Improvement
>  Components: Cluster Management, Mesos
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>
> In certain Mesos/DCOS environments, the slave hostnames aren't resolvable.  
> For this and other reasons, Mesos DNS names would ideally be used for 
> communication within the Flink cluster, not the hostname discovered via 
> `InetAddress.getLocalHost`.
> Some parts of Flink are already configurable in this respect, notably 
> `jobmanager.rpc.address`.  However, the Mesos AppMaster doesn't use that 
> setting for everything (e.g. artifact server), it uses the hostname.
> Similarly, the `taskmanager.hostname` setting isn't used in Mesos deployment 
> mode.   To effectively use Mesos DNS, the TM should use 
> `..mesos` as its hostname.   This could be derived 
> from an interpolated configuration string.





[jira] [Closed] (FLINK-5538) Config option: Kerberos

2017-02-15 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan closed FLINK-5538.
--
Resolution: Fixed

The PR https://github.com/mesosphere/dcos-flink-service/pull/17 has been 
approved and merged.

> Config option: Kerberos
> ---
>
> Key: FLINK-5538
> URL: https://issues.apache.org/jira/browse/FLINK-5538
> Project: Flink
>  Issue Type: Sub-task
>  Components: Mesos
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>






[jira] [Closed] (FLINK-5537) Config option: SSL

2017-02-15 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan closed FLINK-5537.
--

> Config option: SSL
> --
>
> Key: FLINK-5537
> URL: https://issues.apache.org/jira/browse/FLINK-5537
> Project: Flink
>  Issue Type: Sub-task
>  Components: Mesos
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>






[jira] [Resolved] (FLINK-5537) Config option: SSL

2017-02-15 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan resolved FLINK-5537.

Resolution: Fixed

The patch has been merged to mesosphere/dcos-flink-service master



[jira] [Closed] (FLINK-5535) Config option: HDFS

2017-02-15 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan closed FLINK-5535.
--
Resolution: Fixed

The patch has been merged to mesosphere/dcos-flink-service master

> Config option: HDFS
> ---
>
> Key: FLINK-5535
> URL: https://issues.apache.org/jira/browse/FLINK-5535
> Project: Flink
>  Issue Type: Sub-task
>  Components: Mesos
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>






[jira] [Closed] (FLINK-5593) Service: Modify current dcos-flink implementation to use runit service

2017-02-15 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan closed FLINK-5593.
--
Resolution: Fixed

The patch has been merged to mesosphere/dcos-flink-service master

> Service: Modify current dcos-flink implementation to use runit service
> --
>
> Key: FLINK-5593
> URL: https://issues.apache.org/jira/browse/FLINK-5593
> Project: Flink
>  Issue Type: Sub-task
>  Components: Mesos
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>
> The current Flink DCOS integration provides basic functionality to run Flink 
> on DCOS using Marathon (https://github.com/mesosphere/dcos-flink-service), on 
> top of which support for additional security and HDFS configurations needs to 
> be built. We would like to follow an integration approach similar to how Spark 
> is handled (https://github.com/mesosphere/spark-build/tree/master/docker). 
> The purpose of this JIRA is to port the current changes to use the runit 
> service and provide a structure to support additional configurations.





[jira] [Commented] (FLINK-5535) Config option: HDFS

2017-01-26 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840749#comment-15840749
 ] 

Vijay Srinivasaraghavan commented on FLINK-5535:


PR for this issue is available for review

https://github.com/mesosphere/dcos-flink-service/pull/16



[jira] [Commented] (FLINK-5537) Config option: SSL

2017-01-24 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836984#comment-15836984
 ] 

Vijay Srinivasaraghavan commented on FLINK-5537:


PR created for the issue - 
https://github.com/mesosphere/dcos-flink-service/pull/15



[jira] [Commented] (FLINK-5593) Service: Modify current dcos-flink implementation to use runit service

2017-01-20 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832621#comment-15832621
 ] 

Vijay Srinivasaraghavan commented on FLINK-5593:


The draft version of the patch is available for review.

https://github.com/mesosphere/dcos-flink-service/compare/master...vijikarthi:FLINK-5593



[jira] [Updated] (FLINK-5593) Service: Modify current dcos-flink implementation to use runit service

2017-01-20 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan updated FLINK-5593:
---
Summary: Service: Modify current dcos-flink implementation to use runit 
service  (was: Modify current dcos-flink implementation to use runit service)



[jira] [Created] (FLINK-5593) Modify current dcos-flink implementation to use runit service

2017-01-20 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-5593:
--

 Summary: Modify current dcos-flink implementation to use runit 
service
 Key: FLINK-5593
 URL: https://issues.apache.org/jira/browse/FLINK-5593
 Project: Flink
  Issue Type: Sub-task
Reporter: Vijay Srinivasaraghavan
Assignee: Vijay Srinivasaraghavan




[jira] [Assigned] (FLINK-5537) Config option: SSL

2017-01-20 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan reassigned FLINK-5537:
--

Assignee: Vijay Srinivasaraghavan



[jira] [Assigned] (FLINK-5538) Config option: Kerberos

2017-01-20 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan reassigned FLINK-5538:
--

Assignee: Vijay Srinivasaraghavan



[jira] [Assigned] (FLINK-5535) Config option: HDFS

2017-01-20 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan reassigned FLINK-5535:
--

Assignee: Vijay Srinivasaraghavan



[jira] [Closed] (FLINK-3932) Implement State Backend Security

2016-11-28 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan closed FLINK-3932.
--
Resolution: Fixed

> Implement State Backend Security
> 
>
> Key: FLINK-3932
> URL: https://issues.apache.org/jira/browse/FLINK-3932
> Project: Flink
>  Issue Type: New Feature
>  Components: Security
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>  Labels: security
> Fix For: 1.2.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> _This issue is part of a series of improvements detailed in the [Secure Data 
> Access|https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing]
>  design doc._
> Flink should protect its HA, checkpoint, and savepoint state against 
> unauthorized access.
> As described in the design doc, implement:
> - ZooKeeper authentication w/ Kerberos
> - ZooKeeper authorization (i.e. znode ACLs)
> - Checkpoint/savepoint data protection





[jira] [Commented] (FLINK-5055) Security feature crashes JM for certain Hadoop versions even though using no Kerberos

2016-11-11 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657184#comment-15657184
 ] 

Vijay Srinivasaraghavan commented on FLINK-5055:


The Flink security context gets initialized during the application start phase. As 
part of the initialization, the UserGroupInformation (UGI) instance is 
bootstrapped using the Hadoop configuration files (read: the HADOOP_CONF_DIR or 
YARN_CONF_DIR environment variable is set). If the Hadoop configuration 
(core-site) enables security, then the UGI context uses a JAAS module to 
load/login through Kerberos. It appears that in this case, the Hadoop 
configuration that got loaded somehow has security enabled, and UGI is 
trying to obtain the identity using the keytab cache.
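A simplified sketch of that decision (not Hadoop's actual code): UGI effectively reads the real `hadoop.security.authentication` property from the loaded core-site configuration, and any value other than "simple" sends it down the Kerberos login path.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the UGI security check; the Map plays the role of
// the loaded core-site configuration.
public class UgiInitSketch {

    static boolean isSecurityEnabled(Map<String, String> coreSite) {
        // "simple" (the default) means no Kerberos; e.g. "kerberos" enables it
        return !"simple".equalsIgnoreCase(
                coreSite.getOrDefault("hadoop.security.authentication", "simple"));
    }

    public static void main(String[] args) {
        Map<String, String> coreSite = new HashMap<>();
        System.out.println(isSecurityEnabled(coreSite));           // false
        coreSite.put("hadoop.security.authentication", "kerberos");
        System.out.println(isSecurityEnabled(coreSite));           // true
    }
}
```

This is why a cluster with no Kerberos intended can still attempt a Kerberos login: it only takes a picked-up configuration file with that property set.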

> Security feature crashes JM for certain Hadoop versions even though using no 
> Kerberos
> -
>
> Key: FLINK-5055
> URL: https://issues.apache.org/jira/browse/FLINK-5055
> Project: Flink
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Priority: Critical
> Fix For: 1.2.0
>
>
> A user reported [1] that the {{JobManager}} does not start when using Flink 
> with Hadoop-2.7.0-mapr-1607 and no security activated because of 
> {code}
> javax.security.auth.login.LoginException: Unable to obtain Principal Name for 
> authentication
> at 
> com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:841)
> at 
> com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:704)
> at 
> com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
> {code}
> It seems that this Hadoop version always tries to login via Kerberos even 
> though the user did not activate it and, thus, should use 
> {{AuthenticationMode.SIMPLE}}.
> I'm not really familiar with the security feature, but my understanding is 
> that it should not have any effect on Flink when not activated. I might be 
> wrong here, but if not, then we should fix this problem for 1.2.0 because it 
> prevents people from using Flink.
> [1] 
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Flink-using-Yarn-on-MapR-td14484.html





[jira] [Created] (FLINK-4950) Add support to include multiple Yarn application entries in Yarn properties file

2016-10-27 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4950:
--

 Summary: Add support to include multiple Yarn application entries 
in Yarn properties file
 Key: FLINK-4950
 URL: https://issues.apache.org/jira/browse/FLINK-4950
 Project: Flink
  Issue Type: Task
  Components: YARN Client
Reporter: Vijay Srinivasaraghavan
Assignee: Vijay Srinivasaraghavan
Priority: Minor


When deploying Flink on Yarn using the CLI, a Yarn properties file is created in 
the /tmp directory and persisted with the application ID along with a few other 
properties. 

This JIRA addresses two changes to the current implementation:
1) The properties file should be created in the user home directory so that the 
configurations are not leaked
2) Support multiple application entries in the properties file
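The two changes can be sketched with `java.util.Properties`. The key names (`applicationID.<n>`) and the file name are hypothetical, not taken from the actual Flink implementation; in the proposal the directory would be `System.getProperty("user.home")` rather than /tmp.

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch: one indexed entry per running Yarn application, stored in a
// per-user properties file.
public class YarnPropsSketch {

    static Properties roundTrip(File dir) {
        Properties props = new Properties();
        props.setProperty("applicationID.0", "application_1477000000000_0001");
        props.setProperty("applicationID.1", "application_1477000000000_0002");

        File f = new File(dir, ".flink-yarn.properties"); // proposal: dir = user home
        try (FileWriter w = new FileWriter(f)) {
            props.store(w, "multiple Yarn application entries");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        Properties loaded = new Properties();
        try (FileReader r = new FileReader(f)) {
            loaded.load(r);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return loaded;
    }

    public static void main(String[] args) {
        Properties p = roundTrip(new File(System.getProperty("java.io.tmpdir")));
        System.out.println(p.getProperty("applicationID.1"));
    }
}
```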





[jira] [Created] (FLINK-4919) Add secure cookie support for the cluster deployed in Mesos environment

2016-10-25 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4919:
--

 Summary: Add secure cookie support for the cluster deployed in 
Mesos environment
 Key: FLINK-4919
 URL: https://issues.apache.org/jira/browse/FLINK-4919
 Project: Flink
  Issue Type: Task
  Components: Security
Reporter: Vijay Srinivasaraghavan
Assignee: Vijay Srinivasaraghavan








[jira] [Created] (FLINK-4918) Add SSL support to Mesos artifact server

2016-10-25 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4918:
--

 Summary: Add SSL support to Mesos artifact server
 Key: FLINK-4918
 URL: https://issues.apache.org/jira/browse/FLINK-4918
 Project: Flink
  Issue Type: Task
  Components: Security
Reporter: Vijay Srinivasaraghavan
Assignee: Vijay Srinivasaraghavan








[jira] [Updated] (FLINK-4826) Add keytab based kerberos support for Mesos environment

2016-10-13 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan updated FLINK-4826:
---
Summary: Add keytab based kerberos support for Mesos environment  (was: Add 
keytab based kerberos support to run Flink in Mesos environment)

> Add keytab based kerberos support for Mesos environment
> ---
>
> Key: FLINK-4826
> URL: https://issues.apache.org/jira/browse/FLINK-4826
> Project: Flink
>  Issue Type: Task
>  Components: Security
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>






[jira] [Updated] (FLINK-4826) Add keytab based kerberos support to run Flink in Mesos environment

2016-10-13 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan updated FLINK-4826:
---
Summary: Add keytab based kerberos support to run Flink in Mesos 
environment  (was: Add keytab based kerberos support in Mesos environment)



[jira] [Updated] (FLINK-4826) Add keytab based kerberos support in Mesos environment

2016-10-13 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan updated FLINK-4826:
---
Summary: Add keytab based kerberos support in Mesos environment  (was: Add 
keytab based kerberos support to run Flink in Mesos environment)

> Add keytab based kerberos support in Mesos environment
> --
>
> Key: FLINK-4826
> URL: https://issues.apache.org/jira/browse/FLINK-4826
> Project: Flink
>  Issue Type: Task
>  Components: Security
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>






[jira] [Created] (FLINK-4826) Add keytab based kerberos support to run Flink in Mesos environment

2016-10-13 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4826:
--

 Summary: Add keytab based kerberos support to run Flink in Mesos 
environment
 Key: FLINK-4826
 URL: https://issues.apache.org/jira/browse/FLINK-4826
 Project: Flink
  Issue Type: Task
  Components: Security
Reporter: Vijay Srinivasaraghavan
Assignee: Vijay Srinivasaraghavan








[jira] [Commented] (FLINK-4637) Address Yarn proxy incompatibility with Flink Web UI when service level authorization is enabled

2016-10-06 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15552657#comment-15552657
 ] 

Vijay Srinivasaraghavan commented on FLINK-4637:


The Yarn WebAppProxyServlet code passes only a limited, hardcoded set of HTTP 
headers in the request object while redirecting a request from the Yarn RM UI to 
the appropriate Tracking URL (the Flink web app). Since the "Authentication" 
header is not part of the pre-configured pass-through list, navigating to the 
Flink UI from the Yarn RM UI will not work when the secure cookie is enabled. 

https://github.com/apache/hadoop/blob/2e1d0ff4e901b8313c8d71869735b94ed8bc40a0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java#L79

I have also tried adding a few response headers (CORS support) upon detecting 
the missing Authentication header, but I was not able to get it to work (maybe 
someone with AngularJS experience could give it a try). It appears that support 
for passing the Authentication header through should be handled in the Yarn 
proxy code. I have created a bug (YARN-5712) for the Yarn team to suggest 
alternate options.
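For illustration, the pass-through behavior described above can be sketched as a 
simple allow-list filter. This is a minimal sketch, not the actual 
WebAppProxyServlet code; the header names in the allow-list are assumptions, and 
the point is only that any header absent from the list (such as Authentication) 
is silently dropped on redirect:

```java
import java.util.*;

public class HeaderPassThroughSketch {
    // Hypothetical allow-list; the real WebAppProxyServlet hardcodes its own set.
    static final Set<String> PASS_THROUGH = new HashSet<>(Arrays.asList(
            "accept", "accept-encoding", "accept-language", "user-agent"));

    // Keep only headers whose lower-cased name appears on the allow-list.
    static Map<String, String> filter(Map<String, String> requestHeaders) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : requestHeaders.entrySet()) {
            if (PASS_THROUGH.contains(e.getKey().toLowerCase(Locale.ROOT))) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("User-Agent", "curl/7.50");
        headers.put("Authentication", "secure-cookie-value");
        // The Authentication header is dropped because it is not on the list.
        System.out.println(filter(headers).keySet());
    }
}
```

This is why fixing the problem on the Flink side is awkward: the header is gone 
before the Flink web app ever sees the request.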

> Address Yarn proxy incompatibility with Flink Web UI when service level 
> authorization is enabled
> 
>
> Key: FLINK-4637
> URL: https://issues.apache.org/jira/browse/FLINK-4637
> Project: Flink
>  Issue Type: Task
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>
> When service level authorization is enabled (FLINK-3930), the tracking URL 
> (Yarn RM Proxy) is not forwarding the secure cookie and as a result, the 
> Flink Web UI cannot be accessed through the proxy layer. Current workaround 
> is to use the direct Flink Web URL instead of navigating through proxy. This 
> JIRA should address the Yarn proxy/secure cookie navigation issue. 





[jira] [Updated] (FLINK-4637) Address Yarn proxy incompatibility with Flink Web UI when service level authorization is enabled

2016-10-05 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan updated FLINK-4637:
---
Description: When service level authorization is enabled (FLINK-3930), the 
tracking URL (Yarn RM Proxy) is not forwarding the secure cookie and as a 
result, the Flink Web UI cannot be accessed through the proxy layer. Current 
workaround is to use the direct Flink Web URL instead of navigating through 
proxy. This JIRA should address the Yarn proxy/secure cookie navigation issue.  
 (was: When service level authorization is enabled (FLINK-3930), the tracking 
URL (Yarn RM Proxy) is not forwarding the secure cookie and as a result, the 
Flink Web UI cannot be accessed throgh the proxy layer. Current workaround is 
to use the direct Flink Web URL instead of navigating through proxy. This JIRA 
should address the Yarn proxy/secure cookie navigation issue. )

> Address Yarn proxy incompatibility with Flink Web UI when service level 
> authorization is enabled
> 
>
> Key: FLINK-4637
> URL: https://issues.apache.org/jira/browse/FLINK-4637
> Project: Flink
>  Issue Type: Task
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>
> When service level authorization is enabled (FLINK-3930), the tracking URL 
> (Yarn RM Proxy) is not forwarding the secure cookie and as a result, the 
> Flink Web UI cannot be accessed through the proxy layer. Current workaround 
> is to use the direct Flink Web URL instead of navigating through proxy. This 
> JIRA should address the Yarn proxy/secure cookie navigation issue. 





[jira] [Created] (FLINK-4667) Yarn Session CLI not listening on correct ZK namespace when HA is enabled to use ZooKeeper backend

2016-09-23 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4667:
--

 Summary: Yarn Session CLI not listening on correct ZK namespace 
when HA is enabled to use ZooKeeper backend
 Key: FLINK-4667
 URL: https://issues.apache.org/jira/browse/FLINK-4667
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Vijay Srinivasaraghavan
Assignee: Vijay Srinivasaraghavan
Priority: Minor


In Yarn mode, when Flink is configured for HA using the ZooKeeper backend, the 
leader election listener does not provide the correct JM/leader info and will 
time out, since the listener is waiting on the default ZK namespace instead of 
the application-specific one (the Application ID).
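For context, a sketch of the HA configuration involved. The key names below are 
assumptions for the Flink 1.1.x era (they were later renamed to the 
high-availability.* family), and the host names and application ID are 
placeholders:

```
# flink-conf.yaml sketch (assumed key names for the 1.1.x line)
recovery.mode: zookeeper
recovery.zookeeper.quorum: zk-host1:2181,zk-host2:2181
# In Yarn mode the namespace should be the application ID, so that the
# session CLI listens on the application-specific path, not the default.
recovery.zookeeper.path.namespace: application_XXXXXXXXXX_0001
```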





[jira] [Resolved] (FLINK-3670) Kerberos: Improving long-running streaming jobs

2016-09-20 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan resolved FLINK-3670.

   Resolution: Fixed
Fix Version/s: 1.2.0

Resolved as part of FLINK-3929

> Kerberos: Improving long-running streaming jobs
> ---
>
> Key: FLINK-3670
> URL: https://issues.apache.org/jira/browse/FLINK-3670
> Project: Flink
>  Issue Type: Improvement
>  Components: Client, Local Runtime
>Reporter: Maximilian Michels
>Assignee: Vijay Srinivasaraghavan
> Fix For: 1.2.0
>
>
> We have seen in the past, that Hadoop's delegation tokens are subject to a 
> number of subtle token renewal bugs. In addition, they have a maximum life 
> time that can be worked around but is very inconvenient for the user.
> As per [mailing list 
> discussion|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Kerberos-for-Streaming-amp-Kafka-td10906.html],
>  a way to work around the maximum life time of DelegationTokens would be to 
> pass the Kerberos principal and key tab upon job submission. A daemon could 
> then periodically renew the ticket. 
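The periodic-renewal idea described in the quoted issue can be sketched with a 
scheduled task. This is a sketch only: the renewal action is parameterized, and 
the real implementation (per FLINK-3929) would log in from the keytab, for 
example via Hadoop's UserGroupInformation, rather than the placeholder action 
shown here:

```java
import java.util.concurrent.*;

public class TicketRenewerSketch {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Run the given renewal action at a fixed interval, chosen well inside
    // the ticket lifetime so renewal happens before expiry.
    public ScheduledFuture<?> start(Runnable renewAction, long intervalSeconds) {
        return scheduler.scheduleAtFixedRate(
                renewAction, 0, intervalSeconds, TimeUnit.SECONDS);
    }

    public void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        TicketRenewerSketch renewer = new TicketRenewerSketch();
        // Placeholder action; a real daemon would re-login from the keytab,
        // e.g. UserGroupInformation.loginUserFromKeytab(principal, keytabPath).
        renewer.start(() -> System.out.println("renewing Kerberos ticket"), 1);
        Thread.sleep(2500);
        renewer.stop();
    }
}
```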





[jira] [Assigned] (FLINK-3670) Kerberos: Improving long-running streaming jobs

2016-09-20 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan reassigned FLINK-3670:
--

Assignee: Vijay Srinivasaraghavan  (was: Eron Wright )

> Kerberos: Improving long-running streaming jobs
> ---
>
> Key: FLINK-3670
> URL: https://issues.apache.org/jira/browse/FLINK-3670
> Project: Flink
>  Issue Type: Improvement
>  Components: Client, Local Runtime
>Reporter: Maximilian Michels
>Assignee: Vijay Srinivasaraghavan
>
> We have seen in the past, that Hadoop's delegation tokens are subject to a 
> number of subtle token renewal bugs. In addition, they have a maximum life 
> time that can be worked around but is very inconvenient for the user.
> As per [mailing list 
> discussion|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Kerberos-for-Streaming-amp-Kafka-td10906.html],
>  a way to work around the maximum life time of DelegationTokens would be to 
> pass the Kerberos principal and key tab upon job submission. A daemon could 
> then periodically renew the ticket. 





[jira] [Resolved] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-09-20 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan resolved FLINK-3239.

   Resolution: Fixed
Fix Version/s: 1.2.0

Resolved as part of FLINK-3929

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Vijay Srinivasaraghavan
> Fix For: 1.2.0
>
> Attachments: flink3239-prototype.patch
>
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.





[jira] [Created] (FLINK-4637) Address Yarn proxy incompatibility with Flink Web UI when service level authorization is enabled

2016-09-19 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4637:
--

 Summary: Address Yarn proxy incompatibility with Flink Web UI when 
service level authorization is enabled
 Key: FLINK-4637
 URL: https://issues.apache.org/jira/browse/FLINK-4637
 Project: Flink
  Issue Type: Task
Reporter: Vijay Srinivasaraghavan
Assignee: Vijay Srinivasaraghavan


When service level authorization is enabled (FLINK-3930), the tracking URL 
(Yarn RM Proxy) is not forwarding the secure cookie and as a result, the Flink 
Web UI cannot be accessed through the proxy layer. The current workaround is to 
use the direct Flink Web URL instead of navigating through the proxy. This JIRA 
should address the Yarn proxy/secure cookie navigation issue. 





[jira] [Created] (FLINK-4635) Implement Data Transfer Authentication using shared secret configuration

2016-09-19 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4635:
--

 Summary: Implement Data Transfer Authentication using shared 
secret configuration
 Key: FLINK-4635
 URL: https://issues.apache.org/jira/browse/FLINK-4635
 Project: Flink
  Issue Type: Task
Reporter: Vijay Srinivasaraghavan
Assignee: Vijay Srinivasaraghavan


The data transfer authentication (TM/Netty) requirement was not addressed as 
part of FLINK-3930; this JIRA was created to track it.





[jira] [Assigned] (FLINK-3929) Support for Kerberos Authentication with Keytab Credential

2016-05-19 Thread Vijay Srinivasaraghavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Srinivasaraghavan reassigned FLINK-3929:
--

Assignee: Vijay Srinivasaraghavan  (was: Eron Wright )

> Support for Kerberos Authentication with Keytab Credential
> --
>
> Key: FLINK-3929
> URL: https://issues.apache.org/jira/browse/FLINK-3929
> Project: Flink
>  Issue Type: New Feature
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>  Labels: kerberos, security
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> _This issue is part of a series of improvements detailed the [Secure Data 
> Access|https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing]
>  design doc._
> Add support for a keytab credential to be associated with the Flink cluster, 
> to facilitate:
> - Kerberos-authenticated data access for connectors
> - Kerberos-authenticated ZooKeeper access
> Support both the standalone and YARN deployment modes.
>  





[jira] [Commented] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-04-29 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264256#comment-15264256
 ] 

Vijay Srinivasaraghavan commented on FLINK-3239:


My bad: I mistakenly interpreted the krb file as a keytab. I still believe the 
krb configuration is typically read from its default install path (e.g., 
/etc/krb5.conf), so we don't need to pass it. I agree that the keytab is 
required in many places; a common approach (if we plan to accept the keytab from 
the user) is to copy it to a safe/secure location (Spark handles this by copying 
it to the corresponding job directory in HDFS), from where the various 
components can make use of it.

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Stefano Baghino
> Attachments: flink3239-prototype.patch
>
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.





[jira] [Commented] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-04-28 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262538#comment-15262538
 ] 

Vijay Srinivasaraghavan commented on FLINK-3239:


Thanks [~stefanobaghino] 

I believe the Kafka consumer principal and keytab information will be supplied 
in the JAAS config file for the Kafka client code to perform Kerberos 
authentication, and we only need to pass the JAAS config file property to the 
Kafka clients (producer/consumer). 

I am trying to understand the need for passing the Kerberos configuration file 
location (public static final String KRB5_CONF_PATH = "krb5.conf.path"). Could 
you please explain? 

KafkaClient {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/etc/security/keytabs/kafkaClient.keytab"
principal="kafka-cli...@foo.com";
};
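For reference, the JAAS file above is typically handed to the client JVM via a 
system property, and the Kafka 0.9 clients additionally need the SASL security 
protocol selected in their properties. The path below is an example placeholder; 
the service name assumes the broker runs under the "kafka" principal:

```
# JVM flag on the Kafka client process (path is an example):
#   -Djava.security.auth.login.config=/etc/kafka/jaas.conf

# Kafka 0.9 producer/consumer properties:
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
```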

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Stefano Baghino
> Attachments: flink3239-prototype.patch
>
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.





[jira] [Commented] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-04-28 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262341#comment-15262341
 ] 

Vijay Srinivasaraghavan commented on FLINK-3239:


[~stefanobaghino] Could you please share the design proposal or prototype code 
patch?

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Stefano Baghino
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.





[jira] [Commented] (FLINK-3189) Error while parsing job arguments passed by CLI

2016-02-11 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143147#comment-15143147
 ] 

Vijay Srinivasaraghavan commented on FLINK-3189:


What is the right way to pass arguments in the web submission client?

I ran the below command from the CLI and it works fine.
bin/flink run dist/helloworld-1.0.jar -topic topic1 -bootstrap.servers 
10.10.192.227:31000 -zookeeper.connect 10.10.192.226:2181 -group.id dev -fsUri 
hdfs://10.247.10.193/sink

Whereas in the web client, after I upload the jar file and provide the below 
arguments, I get an "org.apache.flink.client.cli.CliArgsException: 
Unrecognized option: -topic" error.
--topic topic1 --bootstrap.servers 10.10.192.227:31000 --zookeeper.connect 
10.10.192.226:2181 --group.id dev --fsUri hdfs://10.247.10.193/sink

> Error while parsing job arguments passed by CLI
> ---
>
> Key: FLINK-3189
> URL: https://issues.apache.org/jira/browse/FLINK-3189
> Project: Flink
>  Issue Type: Bug
>  Components: Command-line client
>Affects Versions: 0.10.1
>Reporter: Filip Leczycki
>Assignee: Matthias J. Sax
>Priority: Minor
> Fix For: 1.0.0, 0.10.2
>
>
> Flink CLI treats job arguments provided in format "-" as its own 
> parameters, which results in errors in execution.
> Example 1:
> call: >bin/flink info myJarFile.jar -f flink -i  -m 1
> error: Unrecognized option: -f
> Example 2:
> Job myJarFile.jar is uploaded to web submission client, flink parameter box 
> is empty
> program arguments box: -f flink -i  -m 1
> error: 
> An unexpected error occurred:
> Unrecognized option: -f
> org.apache.flink.client.cli.CliArgsException: Unrecognized option: -f
>   at 
> org.apache.flink.client.cli.CliFrontendParser.parseInfoCommand(CliFrontendParser.java:296)
>   at org.apache.flink.client.CliFrontend.info(CliFrontend.java:376)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:983)
>   at 
> org.apache.flink.client.web.JobSubmissionServlet.doGet(JobSubmissionServlet.java:171)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:532)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:965)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:388)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:187)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:901)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>   at 
> org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
>   at org.eclipse.jetty.server.Server.handle(Server.java:348)
>   at 
> org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
>   at 
> org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
>   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:549)
>   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:211)
>   at 
> org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:425)
>   at 
> org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:489)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
>   at java.lang.Thread.run(Thread.java:745)
> Execution of 
> >bin/flink run myJarFile.jar -f flink -i  -m 1  
> works perfectly fine





[jira] [Comment Edited] (FLINK-3189) Error while parsing job arguments passed by CLI

2016-02-11 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143147#comment-15143147
 ] 

Vijay Srinivasaraghavan edited comment on FLINK-3189 at 2/11/16 5:52 PM:
-

What is the right way to pass arguments in the web submission client?

I ran the below command from the CLI and it works fine.
bin/flink run dist/helloworld-1.0.jar -topic topic1 -bootstrap.servers 
10.10.192.227:31000 -zookeeper.connect 10.10.192.226:2181 -group.id dev -fsUri 
hdfs://10.247.10.193/sink

Whereas in the web client, after I upload the jar file and provide the below 
arguments, I get an "org.apache.flink.client.cli.CliArgsException: 
Unrecognized option: -topic" error.
-topic topic1 -bootstrap.servers 10.10.192.227:31000 -zookeeper.connect 
10.10.192.226:2181 -group.id dev -fsUri hdfs://10.247.10.193/sink


was (Author: vijikarthi):
What is the right way to pass arguments in web submission client?

I ran below command from CLI and it works fine.
bin/flink run dist/helloworld-1.0.jar -topic topic1 -bootstrap.servers 
10.10.192.227:31000 -zookeeper.connect 10.10.192.226:2181 -group.id dev -fsUri 
hdfs://10.247.10.193/sink

where as in the web client, after I upload the jar file, when I provide below 
argument, I am getting "org.apache.flink.client.cli.CliArgsException: 
Unrecognized option: -topic" error.
--topic topic1 --bootstrap.servers 10.10.192.227:31000 --zookeeper.connect 
10.10.192.226:2181 --group.id dev --fsUri hdfs://10.247.10.193/sink

> Error while parsing job arguments passed by CLI
> ---
>
> Key: FLINK-3189
> URL: https://issues.apache.org/jira/browse/FLINK-3189
> Project: Flink
>  Issue Type: Bug
>  Components: Command-line client
>Affects Versions: 0.10.1
>Reporter: Filip Leczycki
>Assignee: Matthias J. Sax
>Priority: Minor
> Fix For: 1.0.0, 0.10.2
>
>
> Flink CLI treats job arguments provided in format "-" as its own 
> parameters, which results in errors in execution.
> Example 1:
> call: >bin/flink info myJarFile.jar -f flink -i  -m 1
> error: Unrecognized option: -f
> Example 2:
> Job myJarFile.jar is uploaded to web submission client, flink parameter box 
> is empty
> program arguments box: -f flink -i  -m 1
> error: 
> An unexpected error occurred:
> Unrecognized option: -f
> org.apache.flink.client.cli.CliArgsException: Unrecognized option: -f
>   at 
> org.apache.flink.client.cli.CliFrontendParser.parseInfoCommand(CliFrontendParser.java:296)
>   at org.apache.flink.client.CliFrontend.info(CliFrontend.java:376)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:983)
>   at 
> org.apache.flink.client.web.JobSubmissionServlet.doGet(JobSubmissionServlet.java:171)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:532)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:965)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:388)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:187)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:901)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>   at 
> org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
>   at org.eclipse.jetty.server.Server.handle(Server.java:348)
>   at 
> org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
>   at 
> org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
>   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:549)
>   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:211)
>   at 
> org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:425)
>   at 
> org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:489)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
>   at java.lang.Thread.run(Thread.java:745)
> Execution of 
> >bin/flink run myJarFile.jar -f flink -i  -m 1  
> works perfectly fine


