Re: Small-files source - partitioning based on prefix of file

2018-08-09 Thread vino yang
Hi Averell, In this case, I think you may need to extend Flink's existing source. First, read your large tar.gz file; once it has been decompressed, use multiple threads to read the records in the source, and then parse the data format (map/flatMap might be a suitable operator; you can
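
A minimal sketch of the decompression step Vino describes, assuming Apache Commons Compress is on the classpath (the file path and the hand-off to parsing threads are illustrative, not his exact suggestion):

    import java.io.{BufferedInputStream, FileInputStream}
    import java.util.zip.GZIPInputStream
    import org.apache.commons.compress.archivers.tar.TarArchiveInputStream
    import org.apache.commons.compress.utils.IOUtils

    val tarIn = new TarArchiveInputStream(
      new GZIPInputStream(new BufferedInputStream(
        new FileInputStream("/path/to/big-file.tar.gz"))))
    var entry = tarIn.getNextTarEntry
    while (entry != null) {
      if (!entry.isDirectory) {
        // the stream is positioned at the current entry; read its bytes and
        // hand them to a worker thread that gunzips and parses the small file
        val bytes = new Array[Byte](entry.getSize.toInt)
        IOUtils.readFully(tarIn, bytes)
      }
      entry = tarIn.getNextTarEntry
    }
    tarIn.close()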

Re: Standalone cluster instability

2018-08-09 Thread Shailesh Jain
Hi, I hit a similar issue yesterday: the task manager died suspiciously, with no error logs in the task manager logs, but I see the following exceptions in the job manager logs: 2018-08-05 18:03:28,322 ERROR akka.remote.Remoting - Association to

Re: Small-files source - partitioning based on prefix of file

2018-08-09 Thread Averell
Hi Fabian, Vino, I have one more question, for which I initially planned to create a new thread, but now I think it is better to ask here: I need to process one big tar.gz file which contains multiple small gz files. What is the best way to do this? I am thinking of having one single thread process

Re: Small-files source - partitioning based on prefix of file

2018-08-09 Thread Averell
Thank you Vino and Fabian for your help in answering my questions. As my files are small, I think there would not be much benefit in checkpointing file offset state. Thanks and regards, Averell -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink Rebalance

2018-08-09 Thread Paul Lam
Hi Antonio, AFAIK, there are two reasons for this: 1. Rebalancing itself adds latency because it takes time to redistribute the elements. 2. Rebalancing also messes up the order in the Kafka topic partitions, and often makes an event-time window wait longer to trigger in case you're using

Re: Flink Rebalance

2018-08-09 Thread antonio saldivar
Hello, I'm sending ~450 elements per second (the values are in milliseconds, start to end). I went from this with rebalance:

    +------------+
    | AVGWINDOW  |
    +------------+
    | 32131.0853 |
    +------------+

to this without rebalance:

    +------------+
    | AVGWINDOW  |
    +------------+

Re: Flink Rebalance

2018-08-09 Thread Elias Levy
What do you consider a lot of latency? The rebalance will require serializing / deserializing the data as it gets distributed. Depending on the complexity of your records and the efficiency of your serializers, that could have a significant impact on your performance. On Thu, Aug 9, 2018 at

Flink Rebalance

2018-08-09 Thread antonio saldivar
Hello, does anyone know why adding "rebalance()" to my .map steps adds a lot of latency compared to not having rebalance? I have 44 Kafka partitions in my topic and 44 Flink task managers. The execution plan looks like this when I add rebalance, but it is adding a lot of latency: kafka-src ->
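
For reference, a minimal sketch of the two pipelines being compared here (the source and map function are stand-ins, not Antonio's actual Kafka job):

    import org.apache.flink.streaming.api.scala._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(44)

    // stand-in for the 44-partition Kafka source
    val stream: DataStream[String] = env.socketTextStream("localhost", 9999)

    // with rebalance(): every record is redistributed round-robin, which adds
    // a serialization/network hop and breaks per-partition ordering
    val shuffled = stream.rebalance.map(_.toUpperCase)

    // without it, the map chains to the source and records stay in-process
    val chained = stream.map(_.toUpperCase)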

Re: Getting compilation error in Array[TypeInformation]

2018-08-09 Thread Mich Talebzadeh
Thanks, those suggestions helped. Dr Mich Talebzadeh

Dataset.distinct - Question on deterministic results

2018-08-09 Thread Will Bastian
I'm operating on a data set with some challenges to overcome. They are: 1. there may be multiple entries for a single key, and 2. for a single key, there may be multiple unique value-tuples. For example:

    key, val1, val2, val3
    1,   0,    0,    0
    1,   0,    0,    0
    1,
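
A small sketch of the two behaviours in question, with illustrative data modelled on the example above:

    import org.apache.flink.api.scala._

    val env = ExecutionEnvironment.getExecutionEnvironment
    val data = env.fromElements((1, 0, 0, 0), (1, 0, 0, 0), (1, 1, 2, 3))

    // distinct over the whole tuple is deterministic: the duplicate
    // (1,0,0,0) rows collapse and (1,1,2,3) survives
    data.distinct().print()

    // distinct on the key field alone keeps one arbitrary tuple per key, so
    // which of key 1's two unique value-tuples survives is not guaranteed
    data.distinct(0).print()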

Re: Getting compilation error in Array[TypeInformation]

2018-08-09 Thread Timo Walther
Hi Mich, I strongly recommend reading a good Scala programming tutorial before writing to a mailing list. As the error indicates, you are missing generic parameters. If you don't know the parameter, use `Array[TypeInformation[_]]` or `TableSink[_]`. For the types class you need to import the
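
A short sketch of the fix Timo points at, using the field names from the priceTable code below (the concrete field types are an assumption):

    import org.apache.flink.api.common.typeinfo.{TypeInformation, Types}

    // an existential type parameter lets the array hold differently-typed fields
    val fieldTypes: Array[TypeInformation[_]] =
      Array(Types.STRING, Types.STRING, Types.STRING, Types.FLOAT)
    val fieldNames = Array("key", "ticker", "timeissued", "price")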

Getting compilation error in Array[TypeInformation]

2018-08-09 Thread Mich Talebzadeh
This is the code in Scala:

    val tableEnv = TableEnvironment.getTableEnvironment(streamExecEnv)
    tableEnv.registerDataStream("priceTable", splitStream, 'key, 'ticker, 'timeissued, 'price)
    val result = tableEnv.scan("priceTable").filter('ticker === "VOD" && 'price > 99.0).select('key,

Re: Table API, custom window

2018-08-09 Thread Fabian Hueske
Hi, regarding the plans: there are no plans to support custom window assigners and evictors. There were some thoughts about supporting different result-update strategies that could be used to return early results or update results in case of late data. However, these features are currently not

Re: Table API, custom window

2018-08-09 Thread Timo Walther
Hi Oleksandr, currently, we don't support custom windows for the Table API. The Table & SQL APIs try to solve the most common cases, but for more specific logic we recommend the DataStream API. Regards, Timo On 09.08.18 at 14:15, Oleksandr Nitavskyi wrote: Hello guys, I am curious, is there a
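
For completeness, a hedged sketch of the DataStream route Timo recommends, where the assigner/trigger/evictor slots are all pluggable (the input stream, its type, and the concrete trigger/evictor choices are assumptions, not Oleksandr's actual job):

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows
    import org.apache.flink.streaming.api.windowing.evictors.CountEvictor
    import org.apache.flink.streaming.api.windowing.time.Time
    import org.apache.flink.streaming.api.windowing.triggers.ContinuousEventTimeTrigger

    val events: DataStream[(String, Long)] = ???  // assumed input stream

    val windowed = events
      .keyBy(_._1)
      .window(EventTimeSessionWindows.withGap(Time.minutes(15)))
      .trigger(ContinuousEventTimeTrigger.of(Time.seconds(10)))  // custom trigger slot
      .evictor(CountEvictor.of(1000))                            // custom evictor slot
      .reduce((a, b) => (a._1, a._2 + b._2))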

Re: UTF-16 support for TextInputFormat

2018-08-09 Thread David Dreyfus
Hi Fabian, Thank you for taking my email. TextInputFormat.setCharsetName("UTF-16") appears to set the private variable TextInputFormat.charsetName. It doesn't appear to cause additional behavior that would help interpret UTF-16 data. The method I've tested is calling

2 node cluster reading file from ftp

2018-08-09 Thread Mohan mohan
Hi, Context: # Started cluster with 2 nodes: node1 (master & slave), node2 (slave). # Source path to CsvInputFormat is like "ftp://test:test@x.x.x.x:5678/source.csv" and DataSet parallelism is 2. # Path to FileOutputFormat is like "ftp://test:test@x.x.x.x:5678/target.csv" and

Table API, custom window

2018-08-09 Thread Oleksandr Nitavskyi
Hello guys, I am curious: is there a way to define custom windows (assigners/triggers/evictors) for the Table/SQL Flink API? The documentation seems to be silent about this; are there plans for it? Or should we go with the DataStream API in case we need this kind of functionality? Thanks

Re: Heap Problem with Checkpoints

2018-08-09 Thread Piotr Nowojski
Hi, Thanks for getting back with more information. Apparently this is a known JDK bug since 2003 and it is still not resolved: https://bugs.java.com/view_bug.do?bug_id=4872014 https://bugs.java.com/view_bug.do?bug_id=6664633
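
A minimal reproduction of the JDK behaviour behind those bug reports (file names are illustrative):

    import java.io.File

    // each deleteOnExit() call adds the path to a static set in
    // java.io.DeleteOnExitHook, and nothing ever removes it before JVM exit,
    // so a long-running job that keeps creating temp files grows the heap
    for (_ <- 0 until 1000000) {
      val f = File.createTempFile("checkpoint-", ".tmp")
      f.deleteOnExit() // entry retained until the JVM exits
      f.delete()       // deletes the file, but not the hook entry
    }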

Re: Window state with rocksdb backend

2018-08-09 Thread Stefan Richter
Hi, it is not quite clear to me what your window function is doing, so sharing some (pseudo) code would be helpful. Is it ever calling an update function for the state you are trying to modify? From the information I have, it seems not to be the case, and that is a wrong use of the API, which

Re: Window state with rocksdb backend

2018-08-09 Thread Aljoscha Krettek
Hi, the objects you get in the WindowFunction are not supposed to be mutated. Any changes to them are not guaranteed to be synced back to the state backend. Why are you modifying the objects? Maybe there's another way of achieving what you want to do. Best, Aljoscha > On 9. Aug 2018, at
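
The contract Aljoscha describes, sketched with keyed state (an illustrative KeyedProcessFunction, not mingliang's window job): with RocksDB, reads are deserialized copies, so a change is only persisted through an explicit update() on a state handle.

    import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction
    import org.apache.flink.util.Collector

    class RunningCount extends KeyedProcessFunction[String, String, Long] {
      @transient private var count: ValueState[java.lang.Long] = _

      override def open(parameters: Configuration): Unit = {
        count = getRuntimeContext.getState(
          new ValueStateDescriptor("count", classOf[java.lang.Long]))
      }

      override def processElement(value: String,
          ctx: KeyedProcessFunction[String, String, Long]#Context,
          out: Collector[Long]): Unit = {
        val next = Option(count.value()).map(_.longValue()).getOrElse(0L) + 1
        count.update(next) // without this call, RocksDB never sees the change
        out.collect(next)
      }
    }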

Re: [ANNOUNCE] Apache Flink 1.6.0 released

2018-08-09 Thread vino yang
Congratulations! Great work! Till, thank you for advancing the smooth release of Flink 1.6. Vino. Till Rohrmann wrote on Thu, Aug 9, 2018 at 7:21 PM: > The Apache Flink community is very happy to announce the release of Apache > Flink 1.6.0. > > Apache Flink® is an open-source stream processing framework

[ANNOUNCE] Apache Flink 1.6.0 released

2018-08-09 Thread Till Rohrmann
The Apache Flink community is very happy to announce the release of Apache Flink 1.6.0. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. The release is available for download at:

Re: Heap Problem with Checkpoints

2018-08-09 Thread Ayush Verma
Hello Piotr, I work with Fabian and have been investigating the memory leak associated with the issues mentioned in this thread. I took a heap dump of our master node and noticed that there was >1 GB (and growing) worth of entries in the set 'files' in class java.io.DeleteOnExitHook. Almost all the

Re: JDBCInputFormat and SplitDataProperties

2018-08-09 Thread Alexis Sarda
Hi Fabian, Thanks a lot for the help. The Scala DataSet, at least in version 1.5.0, declares javaSet as private[flink], so I cannot access it directly. Nevertheless, I managed to get around it by using the Java environment: val env = org.apache.flink.api.java.ExecutionEnvironment.
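
Alexis's workaround, sketched: build the source with the Java API so the DataSource (and its SplitDataProperties) is directly accessible (the connection settings and row type are illustrative):

    import org.apache.flink.api.common.typeinfo.BasicTypeInfo
    import org.apache.flink.api.java.ExecutionEnvironment
    import org.apache.flink.api.java.io.jdbc.JDBCInputFormat
    import org.apache.flink.api.java.typeutils.RowTypeInfo

    val env = ExecutionEnvironment.getExecutionEnvironment
    val source = env.createInput(
      JDBCInputFormat.buildJDBCInputFormat()
        .setDrivername("org.postgresql.Driver")
        .setDBUrl("jdbc:postgresql://host:5432/db")
        .setQuery("SELECT id, name FROM t")
        .setRowTypeInfo(new RowTypeInfo(
          BasicTypeInfo.INT_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO))
        .finish())

    // createInput on the Java environment returns a DataSource directly,
    // so its SplitDataProperties are reachable without touching javaSet
    val sdp = source.getSplitDataProperties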

Re: Using sensitive configuration/credentials

2018-08-09 Thread vino yang
Hi Chesnay, Oh, I did not know about this feature. Is there any more description of it in the official Flink documentation? Thanks, vino. Chesnay Schepler wrote on Thu, Aug 9, 2018 at 4:29 PM: > If you change the name of your configuration key to include "secret" or > "password" it should be hidden from the logs and UI. > > On

Window state with rocksdb backend

2018-08-09 Thread 祁明良
Hi all, This is mingliang. I got a problem with the RocksDB backend. I'm currently using a 15-minute SessionWindow which also fires every 10 s; there's no pre-aggregation, so the input of the WindowFunction would be the whole iterator of input objects. For the window operator, I assume this collection is

Re: Using sensitive configuration/credentials

2018-08-09 Thread Chesnay Schepler
If you change the name of your configuration key to include "secret" or "password", it should be hidden from the logs and UI. On 09.08.2018 04:28, vino yang wrote: Hi Matt, Flink is currently enhancing its security; for example, data transmission can currently be configured with SSL mode [1].
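
A hedged sketch of the check behind this, assuming GlobalConfiguration.isSensitive, which the 1.5/1.6 code base consults when logging configuration values (key names are made up):

    import org.apache.flink.configuration.GlobalConfiguration

    // masking is decided by a substring match on the key name
    GlobalConfiguration.isSensitive("mydb.connection.password") // true  -> masked in logs/UI
    GlobalConfiguration.isSensitive("mydb.connection.user")     // false -> shown in clear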

Re: JDBCInputFormat and SplitDataProperties

2018-08-09 Thread Fabian Hueske
Hi Alexis, The Scala API does not expose a DataSource object but only a Scala DataSet which wraps the Java object. You can get the SplitDataProperties from the Scala DataSet as follows:

    val dbData: DataSet[...] = ???
    val sdp = dbData.javaSet.asInstanceOf[DataSource[_]].getSplitDataProperties

So

Re: UTF-16 support for TextInputFormat

2018-08-09 Thread Fabian Hueske
Hi David, Did you try to set the encoding on the TextInputFormat with

    TextInputFormat tif = ...
    tif.setCharsetName("UTF-16");

Best, Fabian 2018-08-08 17:45 GMT+02:00 David Dreyfus: > Hello - > > It does not appear that Flink supports a charset encoding of "UTF-16". In > particular, it
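
Fabian's suggestion spelled out as a compilable snippet (the path is illustrative; whether setCharsetName actually changes the decoding is exactly what David questions above):

    import org.apache.flink.api.java.io.TextInputFormat
    import org.apache.flink.core.fs.Path

    val tif = new TextInputFormat(new Path("file:///data/input.txt"))
    tif.setCharsetName("UTF-16")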

Re: Could not cancel job (with savepoint) "Ask timed out"

2018-08-09 Thread vino yang
Hi Juho, We use the REST client API triggerSavepoint(); this API returns a CompletableFuture, and then we call its get() method. You can understand this as me waiting synchronously for it to complete. Because cancelWithSavepoint actually waits synchronously for the savepoint to complete, and then executes
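
Vino's flow, sketched against the 1.5/1.6 ClusterClient (the savepoint directory is made up, and jobId/client are assumed to be in scope):

    import org.apache.flink.api.common.JobID

    // jobId: JobID of the running job; client: e.g. a RestClusterClient
    val future = client.triggerSavepoint(jobId, "hdfs:///savepoints") // CompletableFuture[String]
    val savepointPath = future.get() // block until the savepoint is complete
    client.cancel(jobId)             // only cancel after the savepoint succeeded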

Re: Could not cancel job (with savepoint) "Ask timed out"

2018-08-09 Thread Juho Autio
Thanks for the suggestion. Is the separate savepoint triggering async? Would you then separately poll for the savepoint's completion before executing cancel? If additional polling is needed, then I would say that for my purpose it's still easier to call cancel with savepoint and simply ignore the