[jira] [Created] (FLINK-8848) bin/start-cluster.sh won't start jobmanager on master machine.

2018-03-04 Thread Yesheng Ma (JIRA)
Yesheng Ma created FLINK-8848:
-

 Summary: bin/start-cluster.sh won't start jobmanager on master 
machine.
 Key: FLINK-8848
 URL: https://issues.apache.org/jira/browse/FLINK-8848
 Project: Flink
  Issue Type: Bug
  Components: Cluster Management, Configuration
Affects Versions: 1.4.1
 Environment: I log in to a remote Ubuntu 16.04 server via SSH.

 

conf/masters: hostname:port
Reporter: Yesheng Ma


When I execute bin/start-cluster.sh on the master machine, the command that actually runs is 
`nohup /bin/bash -l /state/partition1/ysma/flink-1.4.1/bin/jobmanager.sh start 
cluster ...`, which does not start the job manager.

I think there might be something wrong with the `-l` argument, since running 
bin/jobmanager.sh start directly works fine. Kindly point out if I have made 
any configuration mistake.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8849) Wrong link from concepts/runtime to doc on chaining

2018-03-04 Thread Ken Krugler (JIRA)
Ken Krugler created FLINK-8849:
--

 Summary: Wrong link from concepts/runtime to doc on chaining
 Key: FLINK-8849
 URL: https://issues.apache.org/jira/browse/FLINK-8849
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Ken Krugler


On https://ci.apache.org/projects/flink/flink-docs-master/concepts/runtime.html 
there's a link to "chaining docs" that currently points at:

https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_api.html#task-chaining-and-resource-groups

but it should link to:

https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/#task-chaining-and-resource-groups




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8850) SQL Client does not support Event-time

2018-03-04 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8850:


 Summary: SQL Client does not support Event-time
 Key: FLINK-8850
 URL: https://issues.apache.org/jira/browse/FLINK-8850
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.5.0
Reporter: Fabian Hueske
 Fix For: 1.5.0


The SQL client fails with an exception if a table includes a rowtime attribute.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8851) SQL Client fails if same file is used as default and env configuration

2018-03-04 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8851:


 Summary: SQL Client fails if same file is used as default and env 
configuration
 Key: FLINK-8851
 URL: https://issues.apache.org/jira/browse/FLINK-8851
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.5.0
Reporter: Fabian Hueske
 Fix For: 1.5.0


Specifying the same file as the default and environment configuration yields the 
following exception:
{code:java}
Exception in thread "main" org.apache.flink.table.client.SqlClientException: 
Unexpected exception. This is a bug. Please consider filing an issue.
    at org.apache.flink.table.client.SqlClient.main(SqlClient.java:156)
Caused by: java.lang.UnsupportedOperationException
    at java.util.AbstractMap.put(AbstractMap.java:209)
    at java.util.AbstractMap.putAll(AbstractMap.java:281)
    at 
org.apache.flink.table.client.config.Environment.merge(Environment.java:107)
    at 
org.apache.flink.table.client.gateway.local.LocalExecutor.createEnvironment(LocalExecutor.java:461)
    at 
org.apache.flink.table.client.gateway.local.LocalExecutor.listTables(LocalExecutor.java:203)
    at 
org.apache.flink.table.client.cli.CliClient.callShowTables(CliClient.java:270)
    at org.apache.flink.table.client.cli.CliClient.open(CliClient.java:198)
    at org.apache.flink.table.client.SqlClient.start(SqlClient.java:97)
    at org.apache.flink.table.client.SqlClient.main(SqlClient.java:146){code}
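
This is not the SQL client code itself, but a minimal sketch of the failure mode the stack trace suggests: {{Environment.merge()}} copies entries via {{putAll()}} into a map that does not support mutation, so the call falls through to {{AbstractMap.put()}}, which always throws:
{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class MergeIntoReadOnlyMapSketch {

    public static void main(String[] args) {
        // Maps such as Collections.singletonMap() / Collections.emptyMap()
        // extend AbstractMap without overriding put(), so any write ends up
        // in AbstractMap.put(), which always throws UnsupportedOperationException.
        Map<String, String> target = Collections.singletonMap("type", "source");

        Map<String, String> other = new HashMap<>();
        other.put("type", "source"); // identical entries, as when the same file is used twice

        // Throws java.lang.UnsupportedOperationException at AbstractMap.put(...)
        target.putAll(other);
    }
}
{code}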



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8852) SQL Client does not work with new FLIP-6 mode

2018-03-04 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8852:


 Summary: SQL Client does not work with new FLIP-6 mode
 Key: FLINK-8852
 URL: https://issues.apache.org/jira/browse/FLINK-8852
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.5.0
Reporter: Fabian Hueske
 Fix For: 1.5.0


The SQL client does not submit queries to a local Flink cluster that runs in 
FLIP-6 mode. It does not throw an exception either.

Job submission works if the legacy Flink cluster mode is used (`mode: old`).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8853) SQL Client cannot emit query results that contain a rowtime attribute

2018-03-04 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8853:


 Summary: SQL Client cannot emit query results that contain a 
rowtime attribute
 Key: FLINK-8853
 URL: https://issues.apache.org/jira/browse/FLINK-8853
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.5.0
Reporter: Fabian Hueske
 Fix For: 1.5.0


Emitting a query result that contains a rowtime attribute fails with the 
following exception:
{code:java}
Caused by: java.lang.ClassCastException: java.sql.Timestamp cannot be cast to 
java.lang.Long
    at 
org.apache.flink.api.common.typeutils.base.LongSerializer.serialize(LongSerializer.java:27)
    at 
org.apache.flink.api.java.typeutils.runtime.RowSerializer.serialize(RowSerializer.java:160)
    at 
org.apache.flink.api.java.typeutils.runtime.RowSerializer.serialize(RowSerializer.java:46)
    at 
org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:125)
    at 
org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:30)
    at 
org.apache.flink.streaming.experimental.CollectSink.invoke(CollectSink.java:66)
    ... 44 more{code}
The problem is caused by the {{ResultStore}}, which configures the 
{{CollectSink}} with the field types obtained from the {{TableSchema}}. The 
type of the rowtime field is a {{TimeIndicatorTypeInfo}}, which is serialized as a 
Long. However, in the query result the value is represented as a Timestamp. Hence, the 
type must be replaced by a {{SqlTimeTypeInfo}}.
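
A rough sketch of the described type replacement (the package locations of {{TimeIndicatorTypeInfo}} and {{SqlTimeTypeInfo}} are assumed as of Flink 1.5; this is not the actual {{ResultStore}} change):
{code:java}
import org.apache.flink.api.common.typeinfo.SqlTimeTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.table.typeutils.TimeIndicatorTypeInfo;

public class RowtimeTypeFixSketch {

    /**
     * Replaces time-indicator types (serialized internally as Long) with
     * SQL TIMESTAMP, which matches the java.sql.Timestamp values that
     * actually appear in the emitted result rows.
     */
    public static TypeInformation<?>[] toExternalTypes(TypeInformation<?>[] schemaTypes) {
        TypeInformation<?>[] externalTypes = new TypeInformation<?>[schemaTypes.length];
        for (int i = 0; i < schemaTypes.length; i++) {
            externalTypes[i] = schemaTypes[i] instanceof TimeIndicatorTypeInfo
                    ? SqlTimeTypeInfo.TIMESTAMP
                    : schemaTypes[i];
        }
        return externalTypes;
    }
}
{code}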



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8854) Mapping of SchemaValidator.deriveFieldMapping() is incorrect.

2018-03-04 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8854:


 Summary: Mapping of SchemaValidator.deriveFieldMapping() is 
incorrect.
 Key: FLINK-8854
 URL: https://issues.apache.org/jira/browse/FLINK-8854
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.5.0
Reporter: Fabian Hueske
 Fix For: 1.5.0


The field mapping returned by {{SchemaValidator.deriveFieldMapping()}} is not 
correct.

It should include not only all fields of the table schema, but also all fields 
of the format schema (mapped to themselves). Otherwise, it is not possible to 
use a timestamp extractor on a field that is not part of the table schema.

For example, this configuration would fail:

{code}
sources:
  - name: TaxiRides
schema:
  - name: rideId
type: LONG
  - name: rowTime
type: TIMESTAMP
rowtime:
  timestamps:
type: "from-field"
from: "rideTime"
  watermarks:
type: "periodic-bounded"
delay: "6"
connector:
  
format:
  property-version: 1
  type: json
  schema: "ROW(rideId LONG, rideTime TIMESTAMP)"
{code}

because {{rideTime}} is not in the table schema.
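
For illustration, a hypothetical helper (not the actual {{SchemaValidator}} code) that produces the mapping described above; for the example configuration it would yield {{rideId -> rideId}}, {{rowTime -> rideTime}}, and additionally {{rideTime -> rideTime}} from the format schema:
{code:java}
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldMappingSketch {

    /**
     * Builds the field mapping from the explicit table-schema mapping plus an
     * identity mapping for every format field, so that fields like "rideTime"
     * that only exist in the format schema remain addressable by a rowtime
     * timestamp extractor.
     */
    public static Map<String, String> deriveMapping(
            Map<String, String> tableSchemaMapping,   // e.g. {rideId=rideId, rowTime=rideTime}
            Iterable<String> formatFields) {          // e.g. [rideId, rideTime]

        Map<String, String> mapping = new LinkedHashMap<>(tableSchemaMapping);
        for (String formatField : formatFields) {
            mapping.putIfAbsent(formatField, formatField); // map format fields to themselves
        }
        return mapping; // {rideId=rideId, rowTime=rideTime, rideTime=rideTime}
    }

    public static void main(String[] args) {
        Map<String, String> tableSchemaMapping = new LinkedHashMap<>();
        tableSchemaMapping.put("rideId", "rideId");
        tableSchemaMapping.put("rowTime", "rideTime");
        System.out.println(deriveMapping(tableSchemaMapping, Arrays.asList("rideId", "rideTime")));
    }
}
{code}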



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8855) SQL client result serving gets stuck in result-mode=table

2018-03-04 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8855:


 Summary: SQL client result serving gets stuck in result-mode=table
 Key: FLINK-8855
 URL: https://issues.apache.org/jira/browse/FLINK-8855
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.5.0
Reporter: Fabian Hueske
 Fix For: 1.5.0


The result serving of a query in {{result-mode=table}} gets stuck after some 
time when serving an updating result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8856) Move all interrupt() calls to TaskCanceler

2018-03-04 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-8856:
---

 Summary: Move all interrupt() calls to TaskCanceler
 Key: FLINK-8856
 URL: https://issues.apache.org/jira/browse/FLINK-8856
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Reporter: Stephan Ewen
 Fix For: 1.5.0


We need this to work around the following JVM bug: 
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8138622

To circumvent this problem, the {{TaskCancelerWatchDog}} must not call 
{{interrupt()}} at all, but only join on the executing thread (with a timeout) 
and cause a hard exit once cancellation takes too long.

A user affected by this problem reported it in FLINK-8834.

Personal note: The Thread.join(...) method is unfortunately not 100% reliable 
either, because it uses {{System.currentTimeMillis()}} rather than 
{{System.nanoTime()}}. Because of that, waits can take overly long when the 
clock is adjusted. I wonder why the JDK authors do not follow their own 
recommendations and use {{System.nanoTime()}} for all relative time measurements...
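
A minimal sketch of the described behavior (a hypothetical class, not the actual {{TaskCancelerWatchDog}}), which also sidesteps the {{Thread.join()}} caveat by computing the deadline with {{System.nanoTime()}}:
{code:java}
public class CancellationWatchDogSketch implements Runnable {

    private final Thread executingThread;
    private final long timeoutMillis;

    public CancellationWatchDogSketch(Thread executingThread, long timeoutMillis) {
        this.executingThread = executingThread;
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    public void run() {
        // Deadline based on System.nanoTime(), so wall-clock adjustments cannot
        // stretch the wait the way a plain Thread.join(timeout) could.
        final long deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
        try {
            long remainingMillis;
            // No interrupt() calls here: only wait for the executing thread.
            while (executingThread.isAlive()
                    && (remainingMillis = (deadlineNanos - System.nanoTime()) / 1_000_000L) > 0) {
                executingThread.join(Math.min(remainingMillis, 1_000L));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (executingThread.isAlive()) {
            // Cancellation took too long: terminate the JVM hard.
            Runtime.getRuntime().halt(-1);
        }
    }
}
{code}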



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: StreamSQL queriable state

2018-03-04 Thread Renjie Liu
Hi, Timo:
I've read your QueryableStateTableSink implementation, and it basically
implements what I want to do. I also want to extend the SQL client so that
users can run point queries against the table sink. Do we still need a design
doc for that? It seems that I just need to finish the remaining part and do
some testing against it.

Hi, Stefano:
Your requirement needs some changes to the flink-table implementation, but I
don't understand why you need that. For debugging? The operator state is
internal and subject to optimization logic, so I think it may be meaningless
to expose it.
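
For the point-query side, the rough shape I have in mind is a thin layer on
top of the existing QueryableStateClient. A sketch only -- the state name
"resultTable", the String key, and the row type below are placeholders, not
an agreed design:

import java.util.concurrent.CompletableFuture;

import org.apache.flink.api.common.JobID;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.queryablestate.client.QueryableStateClient;
import org.apache.flink.types.Row;

public class PointQuerySketch {

    public static void main(String[] args) throws Exception {
        // Connect to the queryable state proxy of a TaskManager.
        QueryableStateClient client = new QueryableStateClient("taskmanager-host", 9069);

        RowTypeInfo rowType = new RowTypeInfo(
            BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.LONG_TYPE_INFO);
        ValueStateDescriptor<Row> descriptor =
            new ValueStateDescriptor<>("resultTable", rowType);

        // Point query: fetch the current row for a single key from the sink's state.
        CompletableFuture<ValueState<Row>> future = client.getKvState(
            JobID.fromHexString(args[0]), "resultTable", "some-key",
            BasicTypeInfo.STRING_TYPE_INFO, descriptor);

        System.out.println(future.get().value());
    }
}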

On Fri, Mar 2, 2018 at 9:37 PM Stefano Bortoli 
wrote:

> Hi Timo, Renjie,
>
> What I was thinking did not include the QueryableStateTableSink, but
> rather tap in directly into the state of a streaming operator. Perhaps it
> is the same thing, but just it sounds not intuitive to consider it a sink.
>
> So, we would need a way to configure the environment for the query to
> share the "state name" before the query is executed, and then use this to
> create the hook for the queriable state in the operator. Perhaps extend the
> current codegen and operator implementations to get as a parameter the
> StateDescriptor to be inquired.
>
> Looking forward for the design document, will be happy to give you
> feedback.
>
> Best,
> Stefano
>
> -Original Message-
> From: Renjie Liu [mailto:liurenjie2...@gmail.com]
> Sent: Friday, March 02, 2018 11:42 AM
> To: dev@flink.apache.org
> Subject: Re: StreamSQL queriable state
>
> Great, thank you.
> I'll start by writing a design doc.
>
> On Fri, Mar 2, 2018 at 6:40 PM Timo Walther  wrote:
>
> > I gave you contributor permissions in Jira. You should be able to
> > assign it to yourself now.
> >
> > Am 3/2/18 um 11:33 AM schrieb Renjie Liu:
> > > Hi, Timo:
> > > It seems that I can't assign it to myself. Could you please help to
> > assign
> > > that to me?
> > > My jira username is liurenjie1024 and my email is
> > liurenjie2...@gmail.com
> > >
> > > On Fri, Mar 2, 2018 at 6:24 PM Timo Walther 
> wrote:
> > >
> > >> Hi Renjie,
> > >>
> > >> that would be great. There is already a Jira issue for it:
> > >> https://issues.apache.org/jira/browse/FLINK-6968
> > >>
> > >> Feel free to assign it to yourself. You can reuse parts of my code
> > >> if you want. But maybe it would make sense to have a little design
> > >> document first about what we want to support.
> > >>
> > >> Regards,
> > >> Timo
> > >>
> > >>
> > >> Am 3/2/18 um 11:10 AM schrieb Renjie Liu:
> > >>> Hi, Timo, I've been planning on the same thing and would like to
> > >> contribute
> > >>> that.
> > >>>
> > >>> On Fri, Mar 2, 2018 at 6:05 PM Timo Walther 
> > wrote:
> > >>>
> >  Hi Stefano,
> > 
> >  yes there are plan in this direction. Actually, I already worked
> >  on
> > such
> >  a QueryableStateTableSink [1] in the past but never finished it
> > because
> >  of priority shifts. Would be great if somebody wants to
> >  contribute
> > this
> >  functionality :)
> > 
> >  Regards,
> >  Timo
> > 
> >  [1] https://github.com/twalthr/flink/tree/QueryableTableSink
> > 
> >  Am 3/2/18 um 10:58 AM schrieb Stefano Bortoli:
> > > Hi guys,
> > >
> > > I am checking out the queriable state API, and it seems that
> > > most of
> > >> the
> >  tooling is already available. However, the queriable state is
> > available
> >  just for the streaming API, not at the StreamSQL API level. In
> > >> principle,
> >  as the flink-table is aware of the query semantic and data output
> > type,
> > >> it
> >  should be possible to configure the query compilation to nest
> > queriable
> >  state in the process/window functions. Is there any plan in this
> > >> direction?
> > > Best,
> > > Stefano
> > >
> >  --
> > >>> Liu, Renjie
> > >>> Software Engineer, MVAD
> > >>>
> > >> --
> > > Liu, Renjie
> > > Software Engineer, MVAD
> > >
> >
> > --
> Liu, Renjie
> Software Engineer, MVAD
>
-- 
Liu, Renjie
Software Engineer, MVAD


Re: Proposal - Change shard discovery in Flink Kinesis Connector to use ListShards

2018-03-04 Thread Bowen Li
If ListShards() gives all the info that Flink needs, +1 on switching.
DescribeStream() has a limitation of 5 requests/sec, which is pretty bad.
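
For reference, a rough sketch of the paginated listShards() call against the
newer SDK (class and method names here are my assumption from aws-sdk-java >=
1.11.272, not tested):

import java.util.ArrayList;
import java.util.List;

import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.kinesis.model.ListShardsRequest;
import com.amazonaws.services.kinesis.model.ListShardsResult;
import com.amazonaws.services.kinesis.model.Shard;

public class ListShardsSketch {

    public static List<Shard> listAllShards(AmazonKinesis kinesis, String streamName) {
        List<Shard> shards = new ArrayList<>();
        String nextToken = null;
        do {
            // StreamName and NextToken are mutually exclusive in ListShards:
            // the first page is requested by stream name, later pages by token.
            ListShardsRequest request = (nextToken == null)
                ? new ListShardsRequest().withStreamName(streamName)
                : new ListShardsRequest().withNextToken(nextToken);
            ListShardsResult result = kinesis.listShards(request);
            shards.addAll(result.getShards());
            nextToken = result.getNextToken();
        } while (nextToken != null);
        return shards;
    }

    public static void main(String[] args) {
        AmazonKinesis kinesis = AmazonKinesisClientBuilder.defaultClient();
        System.out.println(listAllShards(kinesis, args[0]).size() + " shards");
    }
}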

But I believe the goal of switching APIs should be *making Flink jobs that
read from Kinesis more stable*, rather than having a faster shard discovery
rate. The default shard discovery interval is 10s, which is already very
fast and satisfies most Kinesis users; we shouldn't shorten the default value
any further. Developers who want a faster discovery rate than 10s should
override the default value themselves.


I last bumped up the AWS SDK version in flink-connector-kinesis, so from my
experience I'd recommend checking:

   1. Make sure the new SDK works with both the KCL and the KPL in Flink.
   Unfortunately, the AWS SDK versions in the KCL and KPL are not well aligned...
      1.1. If necessary, you need to bump their versions too.
   2. You might need to test it on AWS EMR. The AWS SDK is at 1.11.267 in the
   latest EMR (5.12.0), so compatibility needs to be tested.


Bowen



On Fri, Mar 2, 2018 at 8:47 AM, Thomas Weise  wrote:

> It will be good to be able to use the ListShards API. Are there any
> concerns bumping up the AWS SDK dependency? I see it was last done in
> https://issues.apache.org/jira/browse/FLINK-7422
>
> Thanks
>
> On Wed, Feb 28, 2018 at 10:38 PM, Kailash Dayanand 
> wrote:
>
> > Based on the discussion at here
> >  c2573ea63abda9dd9b8f2a261f@%3Cdev.flink.apache.org%3E>,
> > I want to propose using the latest ListShards API instead of the
> > DescribeStreams on AWS to overcome the rate limits currently imposed on
> > DescribeStream. The new List Shards have a much higher rate limits (a
> > limit of 100 transactions per second per data stream link
> >  API_ListShards.html>).
> > This was recently introduced in the aws-sdk-java release of 1.11.272
> > . I propose
> > bumping up the aws-sdk-java used in flink-kinesis connector and replace
> the
> > DescribeStream calls with ListShards in the KinesisProxy class here
> >  connectors/flink-connector-kinesis/src/main/java/org/
> apache/flink/streaming/connectors/kinesis/proxy/KinesisProxy.java>
> allowing
> > for faster shard discovery rate.
> >
> > Thanks
> > Kailash
> >
>