Re: Existing files and directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite existing files and directories.

2018-07-03 Thread Jörn Franke
import org.apache.flink.core.fs.FileSystem > On 3. Jul 2018, at 08:13, Mich Talebzadeh wrote: > > thanks Hequn. > > When I use it as suggested, I am getting this error > > [error] > /home/hduser/dba/bin/flink/md_streaming/src/main/scala/myPackage/md_streaming.scala:30: > not found: value
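For reference, a minimal sketch of what Jörn's suggestion looks like in a Scala streaming job, assuming the Scala DataStream API; the placeholder source, output path and job name are illustrative only:

    import org.apache.flink.core.fs.FileSystem
    import org.apache.flink.streaming.api.scala._

    object OverwriteExample {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        val lines: DataStream[String] = env.fromElements("a", "b", "c") // placeholder source

        // With org.apache.flink.core.fs.FileSystem imported, WriteMode.OVERWRITE resolves
        // and an existing /tmp/md_streaming.txt is overwritten instead of failing.
        lines.writeAsText("/tmp/md_streaming.txt", FileSystem.WriteMode.OVERWRITE)

        env.execute("md_streaming")
      }
    }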

Re: Existing files and directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite existing files and directories.

2018-07-03 Thread Mich Talebzadeh
thanks Hequn. When I use it as suggested, I am getting this error: [error] /home/hduser/dba/bin/flink/md_streaming/src/main/scala/myPackage/md_streaming.scala:30: not found: value FileSystem [error] .writeAsText("/tmp/md_streaming.txt", FileSystem.WriteMode.OVERWRITE) [error]

Re: Existing files and directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite existing files and directories.

2018-07-03 Thread Mich Talebzadeh
thanks Hequn and Jörn, that helped. But I am still getting this error for a simple streaming program at execution! import java.util.Properties import java.util.Arrays import org.apache.flink.api.common.functions.MapFunction import org.apache.flink.api.java.utils.ParameterTool import

Re: MIs-reported metrics using SideOutput stream + Broadcast

2018-07-03 Thread Chesnay Schepler
Let's see if I understood everything correctly: 1) Let's say that metadata contains N records. The UI output metrics indicate that metadata sends N records. The UI input metrics for join and late-join do each include N records (i.e. N + whatever other data they receive). You expected

Re: The program didn't contain a Flink job

2018-07-03 Thread eSKa
We are running the same job all the time. And that error is happening from time to time. Here is the job submission code: private JobSubmissionResult submitProgramToCluster(PackagedProgram packagedProgram) throws JobSubmitterException, ProgramMissingJobException,

Re: The program didn't contain a Flink job

2018-07-03 Thread Chesnay Schepler
Are you executing these jobs concurrently? The ClusterClient was not written to be used concurrently in the same JVM, as it partially relies on and mutates static fields. On 03.07.2018 09:50, eSKa wrote: We are running the same job all the time. And that error is happening from time to time. Here

Re: The program didn't contain a Flink job

2018-07-03 Thread eSKa
Yes - we are submitting jobs one by one. How can we change that to work for our needs? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Regarding external metastore like HIVE

2018-07-03 Thread Ashwin Sinha
By metastore we mean the catalog where table information is stored. On Tue 3 Jul, 2018, 13:23 Chesnay Schepler, wrote: > What do you mean with "Metastore"? Are you referring to state backends? > > On

Re: The program didn't contain a Flink job

2018-07-03 Thread Chuanlei Ni
Hi @chesnay, I read the code of `ClusterClient` and have not found the `static` field. So why can it not be used in the same JVM? (We also use `ClusterClient` this way, so we really care about this feature.) eSKa wrote on Tue, Jul 3, 2018 at 4:00 PM: > Yes - we are submitting jobs one by one. > How can we

Re: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.assign

2018-07-03 Thread Fabian Hueske
Hi Mich, FlinkKafkaConsumer09 is the connector for Kafka 0.9.x. Have you tried to use FlinkKafkaConsumer011 instead of FlinkKafkaConsumer09? Best, Fabian 2018-07-02 22:57 GMT+02:00 Mich Talebzadeh: > This is becoming very tedious. > > As suggested I changed the kafka dependency from > >
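A hedged sketch of what switching to the 0.11 connector can look like in Scala; the broker address, group id, topic name and connector version are placeholders, not taken from the thread:

    import java.util.Properties

    import org.apache.flink.api.common.serialization.SimpleStringSchema
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011

    object KafkaSourceExample {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Needs the matching connector on the classpath, e.g. in sbt:
        //   "org.apache.flink" %% "flink-connector-kafka-0.11" % "<flink version>"
        val props = new Properties()
        props.setProperty("bootstrap.servers", "localhost:9092") // placeholder broker
        props.setProperty("group.id", "md_streaming")            // placeholder group id

        val consumer = new FlinkKafkaConsumer011[String]("md", new SimpleStringSchema(), props)
        val lines: DataStream[String] = env.addSource(consumer)

        lines.print()
        env.execute("kafka-011-example")
      }
    }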

Re: Regarding external metastore like HIVE

2018-07-03 Thread Chuanlei Ni
@ashwin I cannot find the documentation about `metastore`. Could you give the reference? Ashwin Sinha wrote on Tue, Jul 3, 2018 at 4:08 PM: > By metastore we mean the catalog where table information is stored. > > On Tue 3 Jul, 2018, 13:23 Chesnay Schepler, wrote: > >> What do you mean with "Metastore"? Are you

Re: The program didn't contain a Flink job

2018-07-03 Thread Chesnay Schepler
Dive into this call and you will see that it mutates static fields in the ExecutionEnvironment. https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java#L422 On 03.07.2018 10:07, Chuanlei Ni wrote: Hi @chesnay, I read the code

Re: The program didn't contain a Flink job

2018-07-03 Thread eSKa
Yes - it seems that the main method returns success but for some reason we have that exception thrown. For now we applied a workaround to catch the exception and just skip it (later on our statusUpdater is reading statuses from FlinkDashboard). -- Sent from:

Re: Regarding external metastore like HIVE

2018-07-03 Thread Shivam Sharma
Hi, Please find the documentation link here: https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/table/common.html#register-an-external-catalog. Currently, Flink provides an InMemoryExternalCatalog for demo and testing purposes. However, the ExternalCatalog interface can also be
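For illustration, a short sketch of registering the in-memory catalog mentioned in the docs, assuming the Flink 1.5 Table API (flink-table on the classpath); the catalog name is arbitrary and the InMemoryExternalCatalog constructor taking a name reflects the 1.5 API as I recall it:

    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
    import org.apache.flink.table.api.TableEnvironment
    import org.apache.flink.table.catalog.InMemoryExternalCatalog

    object CatalogExample {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        val tableEnv = TableEnvironment.getTableEnvironment(env)

        // The in-memory catalog the docs describe for demo/testing purposes.
        val catalog = new InMemoryExternalCatalog("demo_catalog")
        tableEnv.registerExternalCatalog("demo_catalog", catalog)

        // Tables added to the catalog become addressable as demo_catalog.<database>.<table>.
      }
    }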

Re: Regarding external metastore like HIVE

2018-07-03 Thread Fabian Hueske
Hi, The docs explain that the ExternalCatalog interface *can* be used to implement a catalog for HCatalog or Metastore. However, there is no such implementation in Flink yet. You would need to implement such a catalog connector yourself. I think there would be quite a few people interested in

Re: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.assign

2018-07-03 Thread Mich Talebzadeh
Hi Fabian. Thanks. Great contribution! It is working: [info] SHA-1: 98d78b909631e5d30664df6a7a4a3f421d4fd33b [info] Packaging /home/hduser/dba/bin/flink/md_streaming/target/scala-2.11/md_streaming-assembly-1.0.jar ... [info] Done packaging. [success] Total time: 14 s, completed Jul 3, 2018 9:32:25

Re: The program didn't contain a Flink job

2018-07-03 Thread Chuanlei Ni
I am really interested in making `ClusterClient` usable as multiple instances in a JVM, because we need to submit jobs in a long-running process. I created a JIRA for this problem. https://issues.apache.org/jira/browse/FLINK-9710 eSKa wrote on Tue, Jul 3, 2018 at 4:20 PM: > Yes - it seems that the main method returns

Re: Dynamic Rule Evaluation in Flink

2018-07-03 Thread Chuanlei Ni
> 1. Since the CoFlatMap function works on a single event, how do we evaluate rules that require aggregations across events? (Match a rule if more than 5 A events happen.) > 2. Since the CoFlatMap function works on a single event, how do we evaluate rules that require pattern
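On the first question, a hedged sketch of one way to keep per-key aggregations in keyed state so that a per-record CoFlatMap can still evaluate threshold rules; the Event and Rule types, field names and the keying scheme are hypothetical, not taken from the thread:

    import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
    import org.apache.flink.util.Collector

    // Hypothetical types for illustration only.
    case class Event(eventType: String, payload: String)
    case class Rule(eventType: String, threshold: Long)

    // Keyed state carries the running count across events, so a per-record
    // CoFlatMap can still evaluate "more than N events of type X" rules.
    class CountThresholdRule extends RichCoFlatMapFunction[Event, Rule, String] {

      @transient private var count: ValueState[java.lang.Long] = _
      @transient private var rule: ValueState[Rule] = _

      override def open(parameters: Configuration): Unit = {
        count = getRuntimeContext.getState(
          new ValueStateDescriptor[java.lang.Long]("count", classOf[java.lang.Long]))
        rule = getRuntimeContext.getState(
          new ValueStateDescriptor[Rule]("rule", classOf[Rule]))
      }

      override def flatMap1(event: Event, out: Collector[String]): Unit = {
        val current = Option(count.value()).map(_.longValue()).getOrElse(0L) + 1
        count.update(current)
        val r = rule.value()
        if (r != null && current > r.threshold) {
          out.collect(s"Rule matched: $current events of type ${event.eventType}")
        }
      }

      override def flatMap2(newRule: Rule, out: Collector[String]): Unit =
        rule.update(newRule)
    }

    // Usage: key both streams by the same field so rule and count share the key context, e.g.
    //   events.keyBy(_.eventType).connect(rules.keyBy(_.eventType)).flatMap(new CountThresholdRule)

For pattern-style rules (the second question), Flink's CEP library is the usual direction, which goes beyond this sketch.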

Re: Existing files and directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite existing files and directories.

2018-07-03 Thread Chesnay Schepler
This issue is covered in your other ML thread "java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.assign". Let's move further discussions there so we don't have 2 threads in parallel for the same problem. On 03.07.2018 09:21, Mich Talebzadeh wrote: thanks Hequn

Re: Regarding external metastore like HIVE

2018-07-03 Thread Chesnay Schepler
What do you mean with "Metastore"? Are you referring to state backends ? On 02.07.2018 18:41, Shivam Sharma wrote: Hi, I have read Flink documentation that Flink supports Metastore which is currently

Re: The program didn't contain a Flink job

2018-07-03 Thread Fabian Hueske
Hi, Let me summarize: 1) Sometimes you get the error message "org.apache.flink.client.program.ProgramMissingJobException: The program didn't contain a Flink job." when submitting a program through the YarnClusterClient 2) The logs and the dashboard state that the job ran successfully 3) The job

Re: Kafka Avro Table Source

2018-07-03 Thread Fabian Hueske
Hi Will, The community is currently working on improving the Kafka Avro integration for Flink SQL. There's a PR [1]. If you like, you could try it out and give some feedback. Timo (in CC) has been working on Kafka Avro and should be able to help with any specific questions. Best, Fabian [1]

Re: Let BucketingSink roll file on each checkpoint

2018-07-03 Thread Fabian Hueske
Hi Xilang, Let me try to summarize your requirements. If I understood you correctly, you are not only concerned about the exactly-once guarantees but also need a consistent view of the data. The data in all files that are finalized needs to originate from a prefix of the stream, i.e., all records

Re: Passing type information to JDBCAppendTableSink

2018-07-03 Thread Fabian Hueske
Hi, In addition to what Rong said: - The types look OK. - You can also use Types.STRING and Types.LONG instead of BasicTypeInfo.xxx - Beware that in the failure case, you might have multiple entries in the database table. Some databases support an upsert syntax which (together with key or
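A small sketch of the sink definition using the shorter Types constants, assuming the flink-jdbc JDBCAppendTableSink builder; the driver, URL, table and column names are placeholders:

    import org.apache.flink.api.common.typeinfo.Types
    import org.apache.flink.api.java.io.jdbc.JDBCAppendTableSink

    object JdbcSinkExample {
      // Placeholder Derby settings and SQL; adjust to your schema.
      val sink = JDBCAppendTableSink.builder()
        .setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
        .setDBUrl("jdbc:derby:memory:flinkdb;create=true")
        .setQuery("INSERT INTO events (name, ts) VALUES (?, ?)")
        .setParameterTypes(Types.STRING, Types.LONG) // instead of BasicTypeInfo.xxx
        .build()
    }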

Re: Regarding external metastore like HIVE

2018-07-03 Thread Timo Walther
Hi, you can follow the progress here: https://issues.apache.org/jira/browse/FLINK-9171 Regards, Timo On 03.07.18 at 10:32, Fabian Hueske wrote: Hi, The docs explain that the ExternalCatalog interface *can* be used to implement a catalog for HCatalog or Metastore. However, there is no

Re: Regarding external metastore like HIVE

2018-07-03 Thread Shivam Sharma
Thanks Timo, Fabian. I will follow this. Best On Tue, Jul 3, 2018 at 3:01 PM Timo Walther wrote: > Hi, > > you can follow the progress here: > https://issues.apache.org/jira/browse/FLINK-9171 > > Regards, > Timo > > > On 03.07.18 at 10:32, Fabian Hueske wrote: > > Hi, > > The docs explain

Re: Trigerring Savepoint for the Flink Job

2018-07-03 Thread Anil
Sorry about the late reply. This reply is more specific to Uber's AthenaX project. To trigger the savepoint we simply need to create an instance of YarnClusterClient. This class has an implementation to trigger a savepoint. To trigger the savepoint for any job
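A rough sketch of what triggering a savepoint through the client can look like, assuming the Flink 1.5-era ClusterClient API in which triggerSavepoint returns a CompletableFuture with the savepoint path; signatures differ across Flink versions, and the helper name and timeout are hypothetical:

    import java.util.concurrent.TimeUnit

    import org.apache.flink.api.common.JobID
    import org.apache.flink.client.program.ClusterClient

    object SavepointHelper {
      // Blocks until the savepoint completes and returns its path.
      // The 60-second timeout is arbitrary.
      def triggerSavepoint(client: ClusterClient[_], jobIdHex: String, targetDir: String): String = {
        val jobId = JobID.fromHexString(jobIdHex)
        client.triggerSavepoint(jobId, targetDir).get(60, TimeUnit.SECONDS)
      }
    }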

Issues with Flink1.5 SQL-Client

2018-07-03 Thread Ashwin Sinha
Hi folks, We are trying to set up the Flink SQL Client. It is still in the development phase, but Flink 1.5 contains a beta version of this feature. Our environment: Kafka topic: test_flink_state_check Kafka key:

Re: Passing type information to JDBCAppendTableSink

2018-07-03 Thread Chris Ruegger
Fabian, Rong: Thanks for the help, greatly appreciated. I am currently using a Derby database for the append-only JDBC sink. So far I don't see a way to use a JDBC/relational database solution for a retract/upsert use case. Is it possible to set up a JDBC sink with Derby or MySQL so that it goes
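Not something Flink ships out of the box, but one pragmatic sketch is to put the database's own upsert syntax into the append sink's query so retried rows update rather than duplicate. This assumes MySQL (Derby has no ON DUPLICATE KEY clause); the table and column names are hypothetical:

    import org.apache.flink.api.common.typeinfo.Types
    import org.apache.flink.api.java.io.jdbc.JDBCAppendTableSink

    object UpsertSinkExample {
      // MySQL-style upsert: a primary key on `word` makes re-emitted rows overwrite.
      val upsertSink = JDBCAppendTableSink.builder()
        .setDrivername("com.mysql.jdbc.Driver")
        .setDBUrl("jdbc:mysql://localhost:3306/flink_demo")
        .setQuery(
          "INSERT INTO word_counts (word, cnt) VALUES (?, ?) " +
          "ON DUPLICATE KEY UPDATE cnt = VALUES(cnt)")
        .setParameterTypes(Types.STRING, Types.LONG)
        .build()
    }

This only approximates upsert semantics at the SQL level; a true retract/upsert table sink would need a dedicated upsert-capable TableSink implementation.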

Checkpointing in Flink 1.5.0

2018-07-03 Thread Data Engineer
The Flink documentation says that we need to specify the filesystem type (file://, hdfs://) when configuring the rocksdb backend dir. https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/state_backends.html#the-rocksdbstatebackend But when I do this, I get an error on job

Re: Checkpointing in Flink 1.5.0

2018-07-03 Thread Data Engineer
2018-07-03 11:30:35,703 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 2018-07-03 11:30:35,705 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting

Re: Checkpointing in Flink 1.5.0

2018-07-03 Thread Chesnay Schepler
Doesn't sound like intended behavior; can you give us the stack trace? On 03.07.2018 13:17, Data Engineer wrote: The Flink documentation says that we need to specify the filesystem type (file://, hdfs://) when configuring the rocksdb backend dir.

Re: Checkpointing in Flink 1.5.0

2018-07-03 Thread Chesnay Schepler
Thanks. Looks like RocksDBStateBackend.setDbStoragePaths has some custom file path parsing logic, will probe it a bit to see what the issue is. On 03.07.2018 13:45, Data Engineer wrote: 2018-07-03 11:30:35,703 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -

Re: Checkpointing in Flink 1.5.0

2018-07-03 Thread Chesnay Schepler
The code appears to be working fine. This may happen because you're using a GlusterFS volume. The RocksDBStateBackend uses java Files internally (NOT nio Paths), which AFAIK only work properly against the plain local file-system. The GlusterFS nio FileSystem implementation also explicitly
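To make the moving parts concrete, a hedged sketch of the two settings being discussed, assuming the Scala API with flink-statebackend-rocksdb on the classpath; both paths and the checkpoint interval are placeholders:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

    object RocksDbBackendExample {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Checkpoint data URI with an explicit scheme, as the documentation describes.
        val backend = new RocksDBStateBackend("file:///flink/checkpoints", true)

        // Local RocksDB working directories on each TaskManager; per the discussion
        // above, these are handled via java.io.File internally.
        backend.setDbStoragePaths("/tmp/flink-rocksdb")

        env.setStateBackend(backend)
        env.enableCheckpointing(10000)
      }
    }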

Why should we use an evictor operator in flink window

2018-07-03 Thread Congxian Qiu
Hi all, when using Flink windows I do not know why we should use an evictor operator in a Flink window; after reading EvictingWindowOperator.java, I think we could do all the [evictor] things in the user function. Could anyone help me to understand this? I have dug through the history of
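For context, a small sketch of where an evictor sits in the API, assuming the Scala DataStream API; the keys, window size and count limit are arbitrary:

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
    import org.apache.flink.streaming.api.windowing.evictors.CountEvictor
    import org.apache.flink.streaming.api.windowing.time.Time
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow

    object EvictorExample {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        val values: DataStream[(String, Long)] = env.fromElements(("a", 1L), ("a", 2L), ("b", 3L))

        values
          .keyBy(_._1)
          .window(TumblingProcessingTimeWindows.of(Time.minutes(5)))
          // CountEvictor drops all but the last 100 buffered elements before
          // the reduce function is applied when the window fires.
          .evictor(CountEvictor.of[TimeWindow](100))
          .reduce((a, b) => (a._1, a._2 + b._2))
          .print()

        env.execute("evictor-example")
      }
    }

The difference from filtering inside the user function is that the eviction happens in the window operator itself, before the function is invoked.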

Re: Issues with Flink1.5 SQL-Client

2018-07-03 Thread Timo Walther
Hi Ashwin, which Flink version is your (local) cluster running? Are you executing Flink in the default (new deployment) mode or the legacy mode? The SQL client supports only the new "FLIP-6" deployment model. I'm not sure about your error message but it might be related to that. Regards, Timo On 03.07.18

Re: Issues with Flink1.5 SQL-Client

2018-07-03 Thread Ashwin Sinha
Hi Timo, Our Flink version is 1.5.0. We followed this documentation and started the Flink cluster the same way. Also, we are getting more logs: https://pastebin.com/fGTW9s2b On Tue, Jul 3, 2018 at 7:01

Re: Issues with Flink1.5 SQL-Client

2018-07-03 Thread Chesnay Schepler
Can you provide us with the JobManager logs? Based on the exception I concur with Timo: it looks like the server is either running 1.4 or below, or was started in legacy mode. On 03.07.2018 15:42, Ashwin Sinha wrote: Hi Timo, Our Flink version is 1.5.0. We followed this

Re: Issues with Flink1.5 SQL-Client

2018-07-03 Thread Xingcan Cui
Hi Ashwin, I encountered this problem before. You should make sure that the version for your Flink cluster and the version you run the SQL-Client are exactly the same. Best, Xingcan > On Jul 3, 2018, at 10:00 PM, Chesnay Schepler wrote: > > Can you provide us with the JobManager logs? > >

Re: Issues with Flink1.5 SQL-Client

2018-07-03 Thread Timo Walther
Great to hear! Timo On 03.07.18 at 16:16, Ashwin Sinha wrote: Hi Chesnay, We got the issue. Thanks for pointing it out. By mistake it was running the 1.3.2 JobManager version; we have now moved to 1.5.0 and it is working smoothly. Will be careful from next time. Thanks for all the support and

How to implement Multi-tenancy in Flink

2018-07-03 Thread Ahmad Hassan
Hi Folks, We are using Flink to capture various interactions of a customer with an ECommerce store, i.e. product views and orders created. We run a 24-hour sliding window with a 5-minute slide, which makes 288 parallel windows for a single tenant. We implement a fold method that has various hashmaps to update the
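As a point of reference, one common shape for this kind of per-tenant sliding aggregation in the Scala API; the Interaction type, field names and the aggregate logic are hypothetical stand-ins for the fold described above:

    import org.apache.flink.api.common.functions.AggregateFunction
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.assigners.SlidingProcessingTimeWindows
    import org.apache.flink.streaming.api.windowing.time.Time

    // Hypothetical event type for illustration.
    case class Interaction(tenantId: String, productId: String, kind: String)

    // Counts interactions per product id within each window.
    class ProductViewCounts extends AggregateFunction[Interaction, Map[String, Long], Map[String, Long]] {
      override def createAccumulator(): Map[String, Long] = Map.empty
      override def add(in: Interaction, acc: Map[String, Long]): Map[String, Long] =
        acc.updated(in.productId, acc.getOrElse(in.productId, 0L) + 1)
      override def getResult(acc: Map[String, Long]): Map[String, Long] = acc
      override def merge(a: Map[String, Long], b: Map[String, Long]): Map[String, Long] =
        b.foldLeft(a) { case (m, (k, v)) => m.updated(k, m.getOrElse(k, 0L) + v) }
    }

    object TenantWindows {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        val interactions: DataStream[Interaction] = env.fromElements(
          Interaction("tenantA", "p1", "view"),
          Interaction("tenantA", "p2", "order"),
          Interaction("tenantB", "p1", "view"))

        // Keying by tenant keeps each tenant's windows and state separate;
        // a 24h size with a 5-minute slide gives the 288 overlapping windows per key.
        interactions
          .keyBy(_.tenantId)
          .window(SlidingProcessingTimeWindows.of(Time.hours(24), Time.minutes(5)))
          .aggregate(new ProductViewCounts)
          .print()

        env.execute("per-tenant-windows")
      }
    }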

Re: Flink kafka consumers don't honor group.id

2018-07-03 Thread Giriraj
Hi Gordon, If I understood you correctly, what you are doing is: while a job with a Kafka consumer is already running, you want to start a new job, also with a Kafka consumer as the source, that uses the same group.id so that the topic's messages are routed between the two jobs. Is this

RE: Checkpointing in Flink 1.5.0

2018-07-03 Thread Jash, Shaswata (Nokia - IN/Bangalore)
Hello Chesnay, The cluster-wide (in Kubernetes) checkpointing directory using a GlusterFS volume mount (thus the file access protocol file:///) was working fine until 1.4.2 for us. So we would like to understand where the breakage happened in 1.5.0. Can you please point me to the relevant source code files

Re: MIs-reported metrics using SideOutput stream + Broadcast

2018-07-03 Thread Cliff Resnick
I found the problem and, of course, it was between my desk and my chair. For side outputs the Flink UI correctly reports (S+N) * Slots. Since the CoProcessFunction late-join was hashed, that factored to S+N. Thanks for the help! On Tue, Jul 3, 2018 at 3:51 AM, Chesnay Schepler wrote: > Let's see if

Re: Regarding external metastore like HIVE

2018-07-03 Thread Rong Rong
+1 on this feature, there have been a lot of pains for us trying to connect to an external catalog / metastore as well. @shivam can you comment on the tickets regarding the specific use case and the type of external catalogs you are interested in? Thanks, Rong On Tue, Jul 3, 2018 at 3:16 AM Shivam