Re: Thread 'SortMerger spilling thread' terminated due to an exception: No space left on device

2016-12-04 Thread Fabian Hueske
Hi Miguel,

have you found a solution to your problem?
I'm not a Docker expert, but this forum thread looks like it could be related
to your problem [1].

Best,
Fabian

[1] https://forums.docker.com/t/no-space-left-on-device-error/10894
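
One way to narrow this down might be to check how much space the JVM itself
sees for the configured temp directory, run from inside the TaskManager
container rather than on the host. The following is only a minimal sketch:
the class name TmpDirSpace and its method are hypothetical helpers (not part
of Flink), and splitting the list on File.pathSeparator is an assumption
about how several directories would be passed in.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: reports the usable space the JVM
// sees for each directory in a list of temp directories.
class TmpDirSpace {

    // Splits the list on File.pathSeparator (assumption) and returns the
    // usable bytes per directory. File.getUsableSpace() returns 0 for a
    // path that does not name a partition, e.g. a missing directory.
    static Map<String, Long> usableBytes(String dirs) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String dir : dirs.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            result.put(dir, new File(dir).getUsableSpace());
        }
        return result;
    }

    public static void main(String[] args) {
        // Defaults to the JVM temp dir; pass e.g. /home/flink/htmp instead.
        String dirs = args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir");
        usableBytes(dirs).forEach((dir, bytes) ->
                System.out.printf("%s: %.1f GB usable%n", dir, bytes / 1e9));
    }
}
```

If the number printed inside the container is much smaller than what the host
reports for /home/myuser/mydir, the limit is probably on the container or
volume side rather than in Flink.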

2016-12-02 17:43 GMT+01:00 Miguel Coimbra :

> Hello Fabian,
>
> I have created a directory on my host machine user directory (
> /home/myuser/mydir ) and I am mapping it as a volume with Docker for the
> TaskManager and JobManager containers.
> Each container will thus have the following directory /home/flink/htmp
>
> host ---> container
> /home/myuser/mydir ---> /home/flink/htmp
>
> I had previously done this successfully with a host directory which
> holds several SNAP data sets.
> In the Flink configuration file, I specified /home/flink/htmp to be used
> as the tmp dir for the TaskManager.
> This seems to be working, as I was able to start the cluster and invoke
> Flink for that Friendster dataset.
>
> However, during execution, there were 2 intermediate files which kept
> growing until they reached about 30 GB.
> At that point, the Flink TaskManager threw the exception again:
>
> java.lang.RuntimeException: Error obtaining the sorted input: Thread
> 'SortMerger spilling thread' terminated due to an exception: No space left
> on device
>
> Here is an ls excerpt of the directory on the host (to which the
> TaskManager container was also writing successfully) shortly before the
> exception:
>
> *31G *9d177a1971322263f1597c3378885ccf.channel
> *31G* a693811249bc5f785a79d1b1b537fe93.channel
>
> Now I believe the host system is capable of storing hundreds of GB more,
> so I am confused as to what the problem might be.
>
> Best regards,
>
> Miguel E. Coimbra
> Email: miguel.e.coim...@gmail.com 
> Skype: miguel.e.coimbra
>
>
>>
>> Hi Miguel,
>>
>> the exception does indeed indicate that the process ran out of available
>> disk space.
>> The quoted paragraph of the blog post describes the situation when you
>> receive the IOE.
>>
>> By default the system's default tmp dir is used. I don't know which folder
>> that would be in a Docker setup.
>> You can configure the temp dir using the taskmanager.tmp.dirs config key.
>> Please see the configuration documentation for details [1].
>>
>> Hope this helps,
>> Fabian
>>
>> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.1/setup/config.html#jobmanager-amp-taskmanager
>>
>> 2016-12-02 0:18 GMT+01:00 Miguel Coimbra :
>>
>>
>>> Hello,
>>>
>>> I have a problem for which I hope someone will be able to give a hint.
>>> I am running the Flink *standalone* cluster with 2 Docker containers (1
>>> TaskManager and 1 JobManager) using 1 TaskManager with 30 GB of RAM.
>>>
>>> The dataset is a large one: SNAP Friendster, which has around 1800 M
>>> edges.
>>> https://snap.stanford.edu/data/com-Friendster.html
>>>
>>> I am trying to run the Gelly built-in label propagation algorithm on top
>>> of it.
>>> As this is a very big dataset, I believe I am exceeding the available
>>> RAM and that the system is using secondary storage, which then fails:
>>>
>>>
>>> Connected to JobManager at Actor[akka.tcp://flink@172.19.0.2:6123/user/jobmanager#894624508]
>>> 12/01/2016 17:58:24 Job execution switched to status RUNNING.
>>> 12/01/2016 17:58:24 DataSource (at main(App.java:33) (org.apache.flink.api.java.io.TupleCsvInputFormat))(1/1) switched to SCHEDULED
>>> 12/01/2016 17:58:24 DataSource (at main(App.java:33) (org.apache.flink.api.java.io.TupleCsvInputFormat))(1/1) switched to DEPLOYING
>>> 12/01/2016 17:58:24 DataSource (at main(App.java:33) (org.apache.flink.api.java.io.TupleCsvInputFormat))(1/1) switched to RUNNING
>>> 12/01/2016 17:58:24 Map (Map at fromTuple2DataSet(Graph.java:343))(1/1) switched to SCHEDULED
>>> 12/01/2016 17:58:24 Map (Map at fromTuple2DataSet(Graph.java:343))(1/1) switched to DEPLOYING
>>> 12/01/2016 17:58:24 Map (Map at fromTuple2DataSet(Graph.java:343))(1/1) switched to RUNNING
>>> 12/01/2016 17:59:51 Map (Map at fromTuple2DataSet(Graph.java:343))(1/1) switched to FAILED
>>> *java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger spilling thread' terminated due to an exception: No space left on device*
>>> at org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:619)
>>> at org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1098)
>>> at org.apache.flink.runtime.operators.MapDriver.run(MapDriver.java:86)
>>> at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:486)
>>> at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:351)
>>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:585)
>>> at java.lang.Thread.run(Thread.java:745)
>>> *Caused by: java.io.IOException: Thread 

Flink CEP dynamic patterns

2016-12-04 Thread Abdallah Ghdiri
As suggested by Matthias, I am going to post my inquiry to the main thread.
Can you please take a look at my question over at Stack Overflow and see if
you have an answer? It is quite an important aspect of an ongoing project.
http://stackoverflow.com/questions/40935714/is-it-possible-to-add-new-patterns-in-flink-cep-after-calling-execute


Re: microsecond resolution

2016-12-04 Thread jeff jacobson
Wow. Really? Is there a way to do micros? A hack? A Jira story? Most (all?)
U.S. equity and European futures, options, and stock markets timestamp in
microseconds. This makes Flink unusable for a massive industry vertical. To
the extent lower-frequency time-series data is being used (e.g. end-of-day
prices), stream processing is kind of overkill. Love everything I've read
about Flink... there's got to be a way to make this work, no?
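
One possible workaround, sketched here only as an idea: keep the full
microsecond value as an ordinary field of the event and hand Flink just the
millisecond part as the event timestamp. Events that fall into the same
millisecond could then be ordered in user code by the sub-millisecond
remainder. The class and method names below are hypothetical (not a Flink
API), and the sketch assumes timestamps arrive as a long counting
microseconds since the epoch. It does not give Flink microsecond-resolution
watermarks or windows.

```java
// Hypothetical helper, not a Flink API: splits an epoch-microsecond value
// into the millisecond timestamp Flink expects plus the leftover micros.
final class MicrosTimestamp {

    private MicrosTimestamp() {
    }

    // Millisecond part, rounded towards negative infinity so that
    // pre-epoch (negative) values are handled consistently.
    static long toMillis(long epochMicros) {
        return Math.floorDiv(epochMicros, 1000L);
    }

    // Microseconds within the millisecond, always in the range 0..999.
    static int microsOfMilli(long epochMicros) {
        return (int) Math.floorMod(epochMicros, 1000L);
    }

    public static void main(String[] args) {
        long epochMicros = 1480890420123456L; // example value only
        System.out.println(toMillis(epochMicros));      // 1480890420123
        System.out.println(microsOfMilli(epochMicros)); // 456
    }
}
```

The value returned by toMillis is what a timestamp assigner would return; the
original microsecond field stays in the record for any finer-grained logic.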

On Sun, Dec 4, 2016 at 5:27 PM, Matthias J. Sax wrote:

> Oh. My bad... Did not read your question carefully enough.
>
> Then the answer is no, it does not support microseconds (only
> milliseconds).
>
> -Matthias
>
>
> On 12/4/16 2:22 PM, jeff jacobson wrote:
> > Sorry if I'm missing something. That link mentions milliseconds,
> > no? My question is whether or not I can specify microseconds where
> > 1000 microseconds = 1 millisecond. Thanks!
> >
> > On Sun, Dec 4, 2016 at 5:05 PM, Matthias J. Sax wrote:
> >
> > Yes. It does.
> >
> > See:
> > https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/event_timestamps_watermarks.html#assigning-timestamps
> >
> > "Both timestamps and watermarks are specified as milliseconds
> > since the Java epoch of 1970-01-01T00:00:00Z."
> >
> >
> >
> > -Matthias
> >
> >
> > On 12/04/2016 10:57 AM, jeff jacobson wrote:
> >> I've sourced stackoverflow, the docs, and the web but I can't
> >> figure out: does flink support microsecond timestamp resolution?
> >> Thanks!
> >
> >
>


Update avro to 1.7.7 or later for flink 1.1.4

2016-12-04 Thread Torok, David
I spent close to two days and tracked down the solution to a major issue with
Avro / GenericRecord and Flink. In short, there is a field marked 'transient'
in Avro 1.7.6 and earlier which interferes with correct Kryo serialization.
This was fixed in Avro 1.7.7, but Flink is dependent on Avro 1.7.6 in its
pom.xml file.
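
For readers unfamiliar with the effect of 'transient', here is a small
illustration. It is only a sketch under stated assumptions: it uses plain
java.io serialization as a stand-in (not Kryo, whose behaviour may differ in
detail), and the Field class with its name and position members is made up
for the example rather than taken from Avro. See the JIRA issue for the
actual root cause.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustration only: shows that a transient field is not carried across a
// serialization round trip. Uses java.io serialization, not Kryo.
class TransientDemo {

    // Hypothetical class; not the real Avro type.
    static class Field implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        transient int position; // skipped when the object is written

        Field(String name, int position) {
            this.name = name;
            this.position = position;
        }
    }

    // Serializes the object to a byte array and reads it back.
    static Field roundTrip(Field field) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(field);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Field) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Field copy = roundTrip(new Field("id", 3));
        // The name survives; the transient position falls back to the int default.
        System.out.println(copy.name + " " + copy.position); // id 0
    }
}
```

Any code that relies on the lost value after deserialization will then see a
default instead of the original, which is the general kind of problem a
stray 'transient' can cause.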

I recorded the root cause and solution in JIRA 
https://issues.apache.org/jira/browse/FLINK-5039

However, the issue is marked as 'minor'. I'd like it to get a little more
attention; hopefully the Avro version can be updated for Flink 1.1.4?

In the meantime, I have created a custom Flink distribution Jar containing the 
Avro 1.7.7 classfiles and it is working perfectly for me now.

Best Regards,
Dave Torok


Re: microsecond resolution

2016-12-04 Thread Abdallah Ghdiri
Help, Matthias.
Can you please take a look at my question over at Stack Overflow and see if
you have an answer?
http://stackoverflow.com/questions/40935714/is-it-possible-to-add-new-patterns-in-flink-cep-after-calling-execute

Abdallah.

On Dec 4, 2016 11:27 PM, "Matthias J. Sax" wrote:

> Oh. My bad... Did not read your question carefully enough.
>
> Then the answer is no, it does not support microseconds (only
> milliseconds).
>
> -Matthias
>
>
> On 12/4/16 2:22 PM, jeff jacobson wrote:
> > Sorry if I'm missing something. That link mentions milliseconds,
> > no? My question is whether or not I can specify microseconds where
> > 1000 microseconds = 1 millisecond. Thanks!
> >
> > On Sun, Dec 4, 2016 at 5:05 PM, Matthias J. Sax wrote:
> >
> > Yes. It does.
> >
> > See:
> > https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/event_timestamps_watermarks.html#assigning-timestamps
> >
> > "Both timestamps and watermarks are specified as milliseconds
> > since the Java epoch of 1970-01-01T00:00:00Z."
> >
> >
> >
> > -Matthias
> >
> >
> > On 12/04/2016 10:57 AM, jeff jacobson wrote:
> >> I've sourced stackoverflow, the docs, and the web but I can't
> >> figure out: does flink support microsecond timestamp resolution?
> >> Thanks!
> >
> >
>


Re: microsecond resolution

2016-12-04 Thread Matthias J. Sax
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

Oh. My bad... Did not read your question carefully enough.

Then the answer is no, it does not support microseconds (only
milliseconds).

- -Matthias


On 12/4/16 2:22 PM, jeff jacobson wrote:
> Sorry if I'm missing something. That link mentions milliseconds,
> no? My question is whether or not I can specify microseconds where
> 1000 microseconds = 1 millisecond. Thanks!
>
> On Sun, Dec 4, 2016 at 5:05 PM, Matthias J. Sax wrote:
>
> Yes. It does.
>
> See:
> https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/event_timestamps_watermarks.html#assigning-timestamps
>
> "Both timestamps and watermarks are specified as milliseconds
> since the Java epoch of 1970-01-01T00:00:00Z."
> 
> 
> 
> -Matthias
> 
> 
> On 12/04/2016 10:57 AM, jeff jacobson wrote:
>> I've sourced stackoverflow, the docs, and the web but I can't
>> figure out: does flink support microsecond timestamp resolution?
>> Thanks!
> 
> 
-BEGIN PGP SIGNATURE-
Comment: GPGTools - https://gpgtools.org

iQIYBAEBCgAGBQJYRJhUAAoJELz8Z8hxAGOiNKoP32ChGeNd7N8Zco2q6lsu+Hxd
JZq62ey3wTrIUS+3oRlILwnu81cViQHtMMVBly3+YnqB85gNiaEUxEQTQCdKPl8G
AqxoFIkMcrKGzwGXigKnCAoVIiyuPeNuhY1d1yv4rWrkt7qb0lCC02Xoq1C0hoS6
Stwk62GXmNRXPYpyjnSq/iAIMbjWaU+ZU0t4V3J8loroNuJ5QcUsJLfRXeo3/5ho
f42L+IANyB5K7vnTxNZYyf5ShNVbTY9/iFaviluxrCNztqGTo7CxMpcyWyMS3wcF
ycXcq/daB+guEJpW0sm4JtMPSsQ/kN99c/ig3t0HX1kDV7xrDDSF2qPvbYOWF38n
omTr7RY3YRFi5LOKvBGa96Aw5UYjMddjcqozWId6xgdXfvz6RUeJCWa9RW8I6ptg
8TaJpM2WgDJMgMuzdl8dDv65l78DkLlNlNo53O66b/9Pt78P75KNjj8naD5kkj4C
i9amwnUNNEnZucA2/1vhzr6cVSzrzBLL7juVj0VmABZo4itUZjjR0UkN7MB+ioWU
trNhaXgE6EP/160n6D0/NUu02prm3jq8mK6gu9lZFWGbAeCUcch+CbvWSaiXAw3H
BOieCsgZD1wfXQJ3wEmnqj/YP94uDlx1IjynskDevjk6OIyIysbBSIqgsUK6fvQ8
ztXO6ls7ARMOBmA=
=/O+Q
-END PGP SIGNATURE-


Re: microsecond resolution

2016-12-04 Thread jeff jacobson
Sorry if I'm missing something. That link mentions milliseconds, no? My
question is whether or not I can specify microseconds where
1000 microseconds = 1 millisecond. Thanks!

On Sun, Dec 4, 2016 at 5:05 PM, Matthias J. Sax wrote:

> Yes. It does.
>
> See:
> https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/event_timestamps_watermarks.html#assigning-timestamps
>
> "Both timestamps and watermarks are specified as milliseconds since the
> Java epoch of 1970-01-01T00:00:00Z."
>
>
>
> -Matthias
>
>
> On 12/04/2016 10:57 AM, jeff jacobson wrote:
> > I've sourced stackoverflow, the docs, and the web but I can't figure
> > out: does flink support microsecond timestamp resolution? Thanks!
>
>