It's java.io.InputStream.

Actually, it was not very difficult to modify the CSV adapter to accept a
stream instead of a file.
But of course, the current POC may have some problems that have not shown up
yet.

CSVReader also accepts a stream reader.
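
To illustrate the idea, here is a minimal sketch of reading CSV rows lazily
from a java.io.InputStream. It is not the POC code and not part of Calcite:
the class name StreamCsvSketch and the method rows() are made up for this
example, and the naive comma split is only a stand-in for a real parser such
as CSVReader (it does not handle quoted fields).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical helper (not a Calcite class): pulls CSV rows from a stream one at a time. */
class StreamCsvSketch {

  /** Returns an iterator that reads one line per call, so the stream is never fully buffered. */
  static Iterator<String[]> rows(InputStream in) {
    final BufferedReader reader =
        new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    return new Iterator<String[]>() {
      // The line that the following call to next() will return; null at end of stream.
      private String pending = readLine();

      private String readLine() {
        try {
          return reader.readLine();
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      }

      @Override public boolean hasNext() {
        return pending != null;
      }

      @Override public String[] next() {
        if (pending == null) {
          throw new NoSuchElementException();
        }
        // Assumption: plain comma-separated values without quoting or escapes.
        String[] fields = pending.split(",", -1);
        pending = readLine();
        return fields;
      }
    };
  }

  public static void main(String[] args) {
    byte[] data = "id,name\n1,Ann\n2,Bob\n".getBytes(StandardCharsets.UTF_8);
    Iterator<String[]> it = rows(new ByteArrayInputStream(data));
    while (it.hasNext()) {
      System.out.println(String.join(" | ", it.next()));
    }
  }
}
```

Because rows are pulled on demand, the same shape should work for a stream
that is very large, as long as the consumer does not collect all rows.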



2016-05-04 20:28 GMT+03:00 Julian Hyde <[email protected]>:

> Can you clarify what you mean by Stream? Do you mean
> java.util.stream.Stream<String>?
>
> The Csv adapter is first and foremost a file adapter. It might be easier
> to create a stream adapter and make it parse csv than start with a file
> adapter and make it handle streams.
>
> > On May 4, 2016, at 10:07 AM, Toivo Adams <[email protected]> wrote:
> >
> > Julian,
> >
> > Thank you for your suggestions.
> >
> > I don't want to monitor a file and read appended records.
> > Initially I want to read from an in-memory stream.
> > Such a stream can be very, very large and may not fit in memory.
> >
> > My idea is to create a NiFi processor which uses SQL for data manipulation.
> > https://nifi.apache.org/
> >
> > NiFi already contains a large set of processors which filter, split,
> > route, etc. different data.
> > Data can be CSV, JSON, Avro, whatever.
> >
> > Different processors use different parameters to specify how data
> > should be filtered, split, routed, etc.
> > I think it would be nice to be able to use an SQL statement to specify
> > how data should be filtered, split, etc.
> > Because NiFi is able to use very big data sets (called FlowFile in NiFi),
> > streaming is a must.
> >
> > I created a very simple POC of how to use a Stream instead of a File.
> > I just created new, modified versions of
> > src/main/java/org/apache/calcite/adapter/csv/CsvEnumerator2.java
> > src/main/java/org/apache/calcite/adapter/csv/CsvSchema2.java
> > src/main/java/org/apache/calcite/adapter/csv/CsvSchemaFactory2.java
> > src/main/java/org/apache/calcite/adapter/csv/CsvTableScan2.java
> > src/main/java/org/apache/calcite/adapter/csv/CsvTranslatableTable2.java
> >
> > CALCITE-1227 describes a slightly different use case.
> >
> > I am ready to contribute back, but my Calcite knowledge is very limited.
> > So the current POC is more like a hack and not good code.
> > Should I upload my current POC files to CALCITE-1227,
> > or is it better to create another issue?
> >
> > Thanks
> > toivo
> >
> >
> > 2016-05-04 19:02 GMT+03:00 Julian Hyde <[email protected]>:
> >
> >> I’ve logged https://issues.apache.org/jira/browse/CALCITE-1227.
> >> Feel free to start implementing it!
> >>
> >>> On May 4, 2016, at 8:56 AM, Julian Hyde <[email protected]> wrote:
> >>>
> >>> It’s not straightforward to re-use a table adapter as a stream adapter.
> >>> The reason is that one query might want to see the past (the current
> >>> contents of the table) and another query might want to see the future
> >>> (the stream of records added from this point on).
> >>>
> >>> I’m guessing that you want something like the CSV adapter that watches
> >>> a file and reports records added to the end of the file (like the tail
> >>> command[1]).
> >>>
> >>> You’d have to change CsvTable to implement StreamableTable, and
> >>> implement the ‘Table stream()’ method to return a variant of the table
> >>> that is in “follow” mode.
> >>>
> >>> It would probably be implemented by a variant of CsvEnumerator, but one
> >>> that gets its input in bursts, as the file is appended to.
> >>>
> >>> Hope that helps.
> >>>
> >>> Julian
> >>>
> >>> [1] https://en.wikipedia.org/wiki/Tail_(Unix)
> >>>
> >>>> On May 2, 2016, at 3:15 AM, Toivo Adams <[email protected]> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> One possibility is to modify CsvEnumerator.
> >>>> Opinions?
> >>>>
> >>>> Thanks
> >>>> Toivo
> >>>>
> >>>> 2016-05-01 18:35 GMT+03:00 Toivo Adams <[email protected]>:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> Please help a newbie.
> >>>>> The CSV adapter works well for reading files, but I want to read
> >>>>> data from a stream.
> >>>>> The data is not of fixed length and may be an endless stream.
> >>>>>
> >>>>> Any ideas how to accomplish this?
> >>>>> Should I try to modify CsvTranslatableTable?
> >>>>> Or should I take the Cassandra adapter as an example?
> >>>>>
> >>>>> Initially the data will be CSV, but later Avro is also a good candidate.
> >>>>>
> >>>>> Thanks
> >>>>> Toivo
> >>>>>
> >>>
> >>
> >>
>
>
