Actually, memory should not be a problem, since the full data set is never materialized in memory. Flink has a streaming runtime, so most of the data would be filtered out immediately. However, reading the whole file does, of course, cause a lot of unnecessary IO.
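The distinction between bounded memory and unnecessary IO can be sketched with a plain Python generator pipeline (an analogy, not Flink code; the file name and helper are made up for illustration). Filtering a lazy stream keeps memory flat but still scans the whole file, while truncating the stream early (roughly what Flink's DataSet first(n) does) also avoids reading the rest:

```python
from itertools import islice

def lines(path):
    # Lazily yield one line at a time; the file is never loaded whole,
    # so memory stays bounded regardless of file size.
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")

# Write a small sample file for the demonstration.
with open("data.csv", "w") as f:
    f.write("\n".join(str(i) for i in range(100_000)))

# Filtering streams through every line: memory use is constant,
# but all 100,000 lines are still read from disk (full IO).
matches = sum(1 for line in lines("data.csv") if line.endswith("0"))

# Truncating the stream stops reading after the first 100 lines,
# avoiding the unnecessary IO as well.
first_100 = list(islice(lines("data.csv"), 100))

print(matches, len(first_100))
```

The same shape applies in Flink: a filter downstream of readCsvFile bounds memory but not IO, while cutting the stream off early saves both.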
2016-04-26 17:09 GMT+02:00 Biplob Biswas <revolutioni...@gmail.com>:
> Thanks, I was looking into the TextInputFormat you suggested, and would get
> back to it once I start working with huge files. I would assume there's no
> workaround or additional parameter to the readCsvFile function to
> restrict the number of lines read in one go, as reading a big file would be
> a big problem in terms of memory.
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-first-operator-tp6377p6451.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.