TypeInformation problem

2019-12-13 Thread Nicholas Walton
I was refactoring some Flink code to use IndexedSeq rather than Array. When I compiled the code I had failures which, according to the URL below, required the following to be inserted /* * Type information (see
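The usual fix for a missing TypeInformation in the Scala API is to bring the implicit derivation into scope. A minimal sketch, assuming an IndexedSeq[Double] element type (not taken from the post):

```scala
// Sketch: the wildcard import below provides the implicit createTypeInformation
// macro, which derives TypeInformation for Scala types such as IndexedSeq.
// The element type Double is an assumption for illustration.
import org.apache.flink.api.scala._
import org.apache.flink.api.common.typeinfo.TypeInformation

implicit val seqInfo: TypeInformation[IndexedSeq[Double]] =
  createTypeInformation[IndexedSeq[Double]]
```

Without the flink-scala wildcard import in scope, operators over IndexedSeq cannot find an implicit TypeInformation and fail to compile.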

Converting streaming to batch execution

2019-11-27 Thread Nicholas Walton
Hi, I’ve been working with a pipeline that was initially aimed at processing high-speed sensor data, but for a proof of concept I’m feeding it simulated data from a CSV file. Each row of the file is a sample across a number of time series, and I’ve been using the streaming environment to process

SQL Performance

2019-11-26 Thread Nicholas Walton
I’m streaming records down to an embedded Derby database at a rate of around 200 records per second. I’m certain Derby could sustain a higher throughput than that if I could buffer the records, but it seems that I’m writing each record as soon as it arrives, as a single transaction, which is
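One way to raise Derby throughput is to batch inserts and commit once per batch instead of once per record. A hedged JDBC sketch (the connection URL, table and column names are assumptions, not the poster’s schema):

```scala
import java.sql.DriverManager

// Stand-in data: (channel, timestamp, value) triples.
val records = Seq((1, 1L, 0.5), (2, 1L, 0.7))

val conn = DriverManager.getConnection("jdbc:derby:sampleDB;create=true")
conn.setAutoCommit(false)                           // commit per batch, not per row
val stmt = conn.prepareStatement("INSERT INTO samples(ch, ts, v) VALUES (?, ?, ?)")
for (((ch, ts, v), i) <- records.zipWithIndex) {
  stmt.setInt(1, ch); stmt.setLong(2, ts); stmt.setDouble(3, v)
  stmt.addBatch()
  if (i % 500 == 499) { stmt.executeBatch(); conn.commit() }  // flush every 500 rows
}
stmt.executeBatch(); conn.commit()                  // flush the remainder
conn.close()
```

Batch size is a tuning knob; a few hundred rows per commit is a common starting point.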

Problem loading JDBC driver

2019-11-26 Thread Nicholas Walton
Hi, I have a pipeline which is sinking into an Apache Derby database, but I’m constantly receiving the error java.lang.IllegalArgumentException: JDBC driver class not found. The Scala libraries I’m loading are val flinkDependencies = Seq( "org.apache.flink" %% "flink-scala" % flinkVersion ,
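A common cause of this error is that the Derby driver jar is available at compile time but not packaged into the job jar the cluster runs. A build.sbt sketch (the Derby version and the extra streaming dependency are assumptions):

```scala
// build.sbt fragment: the JDBC driver must ship in the fat jar deployed to Flink.
val flinkDependencies = Seq(
  "org.apache.flink" %% "flink-scala"           % flinkVersion,
  "org.apache.flink" %% "flink-streaming-scala" % flinkVersion,
  "org.apache.derby" %  "derby"                 % "10.14.2.0"  // embedded JDBC driver
)
```

If the sink loads the driver by name (e.g. "org.apache.derby.jdbc.EmbeddedDriver"), the class name must match the jar actually on the runtime classpath.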

Elastic search sink error handling

2019-11-19 Thread Nicholas Walton
Hi, I need help with handling errors from the Elasticsearch sink, as below 2019-11-19 08:09:09,043 ERROR org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkBase - Failed Elasticsearch item request:
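Failed item requests can be intercepted with an ActionRequestFailureHandler passed to the sink. A hedged sketch of the pattern from the connector documentation (the retry-on-rejection policy is an illustrative assumption):

```scala
import org.apache.flink.streaming.connectors.elasticsearch.{ActionRequestFailureHandler, RequestIndexer}
import org.elasticsearch.action.ActionRequest
import org.elasticsearch.common.util.concurrent.EsRejectedExecutionException

// Sketch: retry requests rejected due to a full bulk queue, drop everything else.
// Rethrowing the failure instead would fail the job and trigger recovery.
val failureHandler = new ActionRequestFailureHandler {
  override def onFailure(action: ActionRequest,
                         failure: Throwable,
                         restStatusCode: Int,
                         indexer: RequestIndexer): Unit =
    failure match {
      case _: EsRejectedExecutionException => indexer.add(action) // queue full: re-enqueue
      case _                               => ()                  // log and drop (assumption)
    }
}
```

The handler runs per failed action, so it decides record by record whether to retry, drop, or fail the job.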

ElasticSearch failing when parallelised

2019-10-17 Thread Nicholas Walton
Hi, I’m running ElasticSearch as a sink for a batch job processing a CSV file of 6.2 million rows, with each row holding 181 numeric values. It quite happily processes a small example of around 2,000 rows, running each column through a single parallel pipeline keyed by column number. However,

containThrowable missing in ExceptionUtils

2019-10-02 Thread Nicholas Walton
Hi, I’m trying to implement a failure handler for ElasticSearch from the example in the Flink documentation DataStream input = ...; input.addSink(new ElasticsearchSink<>( config, transportAddresses, new ElasticsearchSinkFunction() {...}, new ActionRequestFailureHandler() {
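In later Flink releases ExceptionUtils no longer exposes containsThrowable; the closest equivalent is findThrowable, which walks the cause chain and returns a java.util.Optional. A sketch (the parameter names mirror the documented handler example; the retry policy is an assumption):

```scala
import org.apache.flink.streaming.connectors.elasticsearch.RequestIndexer
import org.apache.flink.util.ExceptionUtils
import org.elasticsearch.action.ActionRequest
import org.elasticsearch.common.util.concurrent.EsRejectedExecutionException

// findThrowable plays the role containsThrowable used to: it searches the
// cause chain, but returns Optional[T] instead of a Boolean.
def onFailure(action: ActionRequest, failure: Throwable, indexer: RequestIndexer): Unit = {
  val rejected = ExceptionUtils.findThrowable(failure, classOf[EsRejectedExecutionException])
  if (rejected.isPresent) indexer.add(action)   // bulk queue was full: retry the request
}
```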

Re: Problems with java.utils

2019-09-26 Thread Nicholas Walton
che.flink.table.api.java". So you can try one of the following two > solutions: > 1) Don't import "org.apache.flink.table.api._" > 2) Use absolute import: "import _root_.java.util.ArrayList" > > Regards, > Dian > >> On 26 Sep 2019, at 22:04, Nicholas

Re: Problems with java.utils

2019-09-26 Thread Nicholas Walton
200, "http")) } will not compile, but remove the import org.apache.flink.table.api._ and all is well Nick > On 26 Sep 2019, at 12:53, Nicholas Walton wrote: > > I’m having a problem using ArrayList in Scala . The code is below > > import org.apache.flink.core.fs._ >

Problems with java.utils

2019-09-26 Thread Nicholas Walton
I’m having a problem using ArrayList in Scala. The code is below import org.apache.flink.core.fs._ import org.apache.flink.streaming.api._ import org.apache.flink.streaming.api.scala._ import org.apache.flink.table.api._ import org.apache.flink.table.api.scala._ import
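The clash arises because org.apache.flink.table.api contains a member named java, which the wildcard import brings into scope, shadowing the root java package. A minimal reproduction without Flink (the api object below is a stand-in for the Flink package):

```scala
// Stand-in for org.apache.flink.table.api, which has a member named `java`.
object api { object java { object util } }

import api._                                      // `java` now refers to api.java
// `new java.util.ArrayList[Int]()` here would resolve against api.java and fail;
// _root_ anchors the path at the real top-level package.
val list = new _root_.java.util.ArrayList[Int]()
list.add(42)
println(list.get(0))
```

The other escape is simply not wildcard-importing the package that contains the shadowing member.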

Sampling rate higher than 1Khz

2019-01-28 Thread Nicholas Walton
Flink’s watermarks are in milliseconds. I have data sampled off a sensor at a rate exceeding 1 kHz, or one sample per millisecond. Is there a way to handle timestamp granularity below milliseconds, or will I have to generate a timestamp for the millisecond value preceding that associated with the sensor
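Since Flink event timestamps are just Longs, one workaround is to treat the unit as microseconds throughout the job, scaling every window and timer size by 1000 to match. A sketch of the synthetic timestamp (the per-millisecond sample index is an assumed field):

```scala
// Interpret the Long timestamp as microseconds: 1000 "units" per real
// millisecond leaves room for up to 1000 samples inside each millisecond.
def syntheticTimestamp(epochMillis: Long, sampleWithinMs: Int): Long = {
  require(sampleWithinMs >= 0 && sampleWithinMs < 1000)
  epochMillis * 1000L + sampleWithinMs
}

println(syntheticTimestamp(1575244800000L, 250))  // 1575244800000250
```

The trade-off is that every duration the job configures (windows, timers, allowed lateness) must use the same scaled unit.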

Parallel stream partitions

2018-07-17 Thread Nicholas Walton
Suppose I have a data stream of tuples with the sequence of ticks being 1,2,3,… for each separate k. I understand that keyBy(2) will partition the stream so each partition has the same key in each tuple. I now have a sequence of functions to apply to the streams, say f(), g() and h(), in that
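After keyBy, per-key order is preserved through a chain of one-to-one operators, so f, g and h can simply be chained maps. A hedged sketch (only the names f, g, h come from the question; the tuple shape and bodies are placeholders):

```scala
import org.apache.flink.streaming.api.scala._

def f(t: (Int, Long, Double)) = t   // placeholders for the real per-partition logic
def g(t: (Int, Long, Double)) = t
def h(t: (Int, Long, Double)) = t

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.fromElements((1, 1L, 0.5), (2, 1L, 0.7))
  .keyBy(_._1)   // one logical partition per key; a key always lands on the same subtask
  .map(f _)      // records of a given key pass through f, g, h in tick order
  .map(g _)
  .map(h _)
  .print()
```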

Re: Parallelism and keyed streams

2018-07-17 Thread Nicholas Walton
ple4(a(i+1)._1, a(i+1)._2, v, a(i+1)._4)) } Job.LOG.debug("logRatioWindowFunction [" + a.head._1 + ", " + a.head._2 + " ... " + a.last._2 +"] collected") } } } > On 17 Jul 2018, at 00:15, Martin, Nick <mailto:nick.mar...@orbitalatk.com

Parallelism and keyed streams

2018-07-16 Thread Nicholas Walton
I have a stream of tuples, which I form into a keyedStream using keyBy on channel. I then need to process each channel in parallel. Each parallel stream must be processed in strict sequential order by index to calculate the ratios value(index)/value(index-1). If I set parallelism to 1 all is
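The usual pattern for this is keyed state: after keyBy on channel, each parallel instance holds only its own channel’s previous value. A hedged sketch (the (channel, index, value) tuple layout is an assumption):

```scala
import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.util.Collector

// Per-channel ratio value(index)/value(index-1) using keyed ValueState.
// For simplicity this treats 0.0 as "no previous value yet".
class RatioFn extends RichFlatMapFunction[(Int, Long, Double), (Int, Long, Double)] {
  @transient private var prev: ValueState[Double] = _

  override def open(parameters: Configuration): Unit =
    prev = getRuntimeContext.getState(
      new ValueStateDescriptor[Double]("prev", classOf[Double]))

  override def flatMap(in: (Int, Long, Double),
                       out: Collector[(Int, Long, Double)]): Unit = {
    val p = prev.value()
    if (p != 0.0) out.collect((in._1, in._2, in._3 / p))  // emit once history exists
    prev.update(in._3)
  }
}
```

keyBy preserves per-key arrival order from a single upstream task, so the ratios stay in index order as long as the source emits each channel in order.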

Extending stream events with a an aggregate value

2018-06-06 Thread Nicholas Walton
I’m sure I’m being a complete idiot, since this seems so trivial but if someone could point me in the right direction I’d be very grateful. I have a simple data stream [(Int, Double)] keyed on the Int. I can calculate the running max of the stream no problem using “.max(2)”. But I want to
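In Flink, .max(2) produces a rolling aggregate but does not keep the other fields of the current event; carrying each event plus its running max is typically a keyed map with state. The underlying logic in plain Scala (data values are made up):

```scala
// Extend each (key, value) event with the running max seen so far.
// Single key here for brevity; in Flink the max would live in per-key state.
val events = Seq((1, 0.5), (1, 2.0), (1, 1.5))
val withMax = events.scanLeft((0, 0.0, Double.MinValue)) {
  case ((_, _, m), (k, v)) => (k, v, math.max(m, v))
}.tail
println(withMax)  // List((1,0.5,0.5), (1,2.0,2.0), (1,1.5,2.0))
```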

Re: Raspberry Pi Memory Configuration

2018-04-29 Thread Nicholas Walton
Thanks for the reply, but I tracked the problem down to a missing M in taskmanager.sh # export JVM_ARGS="${JVM_ARGS} -Xms${TM_HEAP_SIZE}M -Xmx${TM_HEAP_SIZE}M -XX:MaxDirectMemorySize=${TM_MAX_OFFHEAP_SIZE}" export JVM_ARGS="${JVM_ARGS} -Xms${TM_HEAP_SIZE}M -Xmx${TM_HEAP_SIZE}M" I had

Raspberry Pi Memory Configuration

2018-04-28 Thread Nicholas Walton
Hi, hope this is the right place to ask, but I have a cluster of five Raspberry Pi 3s running Flink, on which I’m hitting memory issues. Each Pi has 1 GB of RAM and a 256 GB USB drive, however I don’t seem to be able to configure Java to use more than 256 MB of heap. The memory problem, I’m
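On a 1 GB node the heap is normally set in flink-conf.yaml rather than by editing JVM_ARGS directly. A hedged fragment using the 2018-era config keys (the values are guesses for a 1 GB Pi, not tested settings):

```
# flink-conf.yaml fragment (values are assumptions for a 1 GB node)
jobmanager.heap.mb: 256
taskmanager.heap.mb: 512
taskmanager.numberOfTaskSlots: 1
```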