Hi Philip,
the CsvInputFormat does not support reading empty fields.
I see two ways to achieve this functionality:
- Use a TextInputFormat that returns each line as a String and do the
parsing in a subsequent MapFunction
- Extend the CsvInputFormat to support empty fields
Cheers,
Fabian
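
The first option above (read each line as a String and parse it afterwards) could look roughly like the sketch below. It is only an illustration: the record type WebClickOpt, the object WebClickParser and the assumption of six '|'-separated numeric columns are hypothetical, taken from the column layout quoted later in this thread, and the parser is plain Scala so it can be tried without a Flink cluster.

```scala
// Hypothetical record type: every column is an Option so empty fields survive parsing.
final case class WebClickOpt(
  clickDate: Option[Long], clickTime: Option[Long], sales: Option[Int],
  item: Option[Int], page: Option[Int], user: Option[Int])

object WebClickParser {
  // An empty or whitespace-only field becomes None; anything else is kept trimmed.
  private def field(s: String): Option[String] = {
    val t = s.trim
    if (t.isEmpty) None else Some(t)
  }

  // The limit -1 keeps trailing empty fields, which String.split drops by default.
  // The pipe is escaped because split takes a regular expression.
  def parseLine(line: String): WebClickOpt = {
    val f = line.split("\\|", -1)
    require(f.length == 6, s"expected 6 fields, got ${f.length}: $line")
    WebClickOpt(
      field(f(0)).map(_.toLong), field(f(1)).map(_.toLong), field(f(2)).map(_.toInt),
      field(f(3)).map(_.toInt), field(f(4)).map(_.toInt), field(f(5)).map(_.toInt))
  }
}

// In a Flink job this would be applied to the lines of the file, for example:
//   env.readTextFile(path).map(WebClickParser.parseLine _)
```

Whether Option fields are a good fit for the later operators in the job is a separate question; the point here is only that the parsing step no longer fails on empty columns.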
2015-10
Thanks for your reply.
What if I do not use the Table API?
The error happens when using just env.readCsvFile().
I heard that using RowSerializer would handle null values, but a
TypeInformation error occurs when it is converted
On Mon, Oct 26, 2015 at 10:26 AM, Maximilian Michels wrote
As far as I know, null support was removed from the Table API because
it was not consistently supported across all operations. See
https://issues.apache.org/jira/browse/FLINK-2236
On Fri, Oct 23, 2015 at 7:18 PM, Shiti Saxena wrote:
> For a similar problem where we wanted to preserve and t
Also, following Shiti's suggestion, we could use RowSerializer to handle
null values, right?
I tried it in many ways, but it still did not work.
Could you give an example of it, based on the previous email?
On Sat, Oct 24, 2015 at 11:19 PM, Philip Lee wrote:
Maximilian said it would be acceptable to handle null values as Strings.
But in fact, readCsvFile() still cannot accept null values; the error
message says "Row too short".
case class WebClick(click_date: String, click_time: String, user:
String, item: String)
private def getWebClickDataSet(env
For a similar problem where we wanted to preserve and track null entries,
we load the CSV as a DataSet[Array[Object]] and then transform it into
DataSet[Row] using a custom RowSerializer (
https://gist.github.com/Shiti/d0572c089cc08654019c) which handles null.
The Table API (which supports null) can
Hi Philip,
How about making the empty field of type String? Then you can read the CSV
into a DataSet and treat the empty string as a null value. It is not very
nice, but it is a workaround. As of now, Flink deliberately doesn't support
null values.
Regards,
Max
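
As a rough sketch of the workaround described above, the record can be declared with String fields only, and empty strings can be turned into missing values in a later step. The names TypedWebClick and EmptyAsNull are hypothetical, and the target types (Long for date and time, Int for user and item) are assumptions for illustration; only the String-typed WebClick follows the case class used in this thread.

```scala
// String-typed record, as it would be read from the CSV file.
final case class WebClick(click_date: String, click_time: String, user: String, item: String)

// Hypothetical typed counterpart; None stands in for a null or empty field.
final case class TypedWebClick(
  clickDate: Option[Long], clickTime: Option[Long], user: Option[Int], item: Option[Int])

object EmptyAsNull {
  // Option(s) maps null to None; empty and whitespace-only strings are filtered out too.
  def opt(s: String): Option[String] = Option(s).map(_.trim).filter(_.nonEmpty)

  // Convert each field separately so one empty column does not affect the others.
  def toTyped(w: WebClick): TypedWebClick =
    TypedWebClick(
      opt(w.click_date).map(_.toLong), opt(w.click_time).map(_.toLong),
      opt(w.user).map(_.toInt), opt(w.item).map(_.toInt))
}
```

In a job, a conversion like EmptyAsNull.toTyped would sit in a map step right after the String-typed DataSet is read.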
On Thu, Oct 22, 2015 at 4:30 PM, Philip Lee wr
Hi,
I am trying to load a dataset in which some fields are null by using
readCsvFile().
// e.g. _date|_click|_sales|_item|_web_page|_user
case class WebClick(_click_date: Long, _click_time: Long, _sales: Int,
_item: Int, _page: Int, _user: Int)
private def getWebClickDataSet(env: ExecutionEnviro