Re: Kafka segmentation

2016-11-17 Thread Hoang Bao Thien
M, Cody Koeninger wrote:
> >>
> >> If you want a consistent limit on the size of batches, use
> >> spark.streaming.kafka.maxRatePerPartition (assuming you're using
> >> createDirectStream)
> >>
> >> http://spark.apache.org/docs/lates
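
For a rough sense of what that setting does: with the direct stream, spark.streaming.kafka.maxRatePerPartition caps the number of records read per second from each Kafka partition, so one batch is bounded by rate x partitions x batch interval in seconds. Below is a minimal sketch of that arithmetic in plain Python. The function name and the numbers are made up for illustration and are not part of Spark.

```python
def max_records_per_batch(max_rate_per_partition, num_partitions, batch_interval_seconds):
    """Upper bound on records in one batch of a direct stream.

    max_rate_per_partition: value of spark.streaming.kafka.maxRatePerPartition
                            (records per second, per Kafka partition)
    num_partitions:         total Kafka partitions the stream reads from
    batch_interval_seconds: streaming batch interval, in seconds
    """
    if max_rate_per_partition <= 0:
        # This sketch only models the case where a positive limit is set.
        raise ValueError("expected a positive rate limit")
    return int(max_rate_per_partition * num_partitions * batch_interval_seconds)


# Hypothetical setup: 1000 records/s per partition, 4 partitions, 10 s batches.
limit = max_records_per_batch(1000, 4, 10)
print(limit)
```

The setting itself is normally passed as ordinary Spark configuration, for example `--conf spark.streaming.kafka.maxRatePerPartition=1000` on spark-submit; the value 1000 here is only an example.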

Re: Kafka segmentation

2016-11-17 Thread Hoang Bao Thien
erPartition (assuming you're using
> createDirectStream)
>
> http://spark.apache.org/docs/latest/configuration.html#spark-streaming
>
> On Thu, Nov 17, 2016 at 12:52 AM, Hoang Bao Thien wrote:
> > Hi,
> >
> > I use CSV and other text files to Kafka just t

Re: Kafka segmentation

2016-11-17 Thread Hoang Bao Thien
spark?)
> >>
> >> auto.offset.reset=largest just means that when starting the job
> >> without any defined offsets, it will start at the highest (most
> >> recent) available offsets. That's probably not what you want if
> >> you've already loaded c
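
The rule described in the quoted text can be sketched as a toy model in plain Python. This is not Kafka or Spark code: the function and its parameters are hypothetical, and only the setting values "largest" and "smallest" come from the old Kafka consumer configuration.

```python
def starting_offset(saved_offset, earliest, latest, auto_offset_reset="largest"):
    """Pick where a job starts reading one partition (toy model).

    saved_offset: an explicitly defined offset, or None if there is none
    earliest:     lowest offset still available in the partition
    latest:       highest (most recent) available offset
    """
    if saved_offset is not None:
        # Defined offsets always win; auto.offset.reset is not consulted.
        return saved_offset
    if auto_offset_reset == "largest":
        # Start at the end: records already in the topic are skipped.
        return latest
    if auto_offset_reset == "smallest":
        # Start at the beginning: records already in the topic are read.
        return earliest
    raise ValueError("unknown auto.offset.reset value: %r" % auto_offset_reset)


# Hypothetical partition that already holds offsets 0..499.
print(starting_offset(None, 0, 500))              # no defined offsets, "largest"
print(starting_offset(None, 0, 500, "smallest"))  # no defined offsets, "smallest"
```

In this model, a job started with "largest" and no defined offsets begins at offset 500, so the 500 records loaded beforehand are never read.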