Can you clarify what you mean by Stream? Do you mean java.util.stream.Stream&lt;String&gt;?
The Csv adapter is first and foremost a file adapter. It might be easier to create a stream adapter and make it parse CSV than to start with a file adapter and make it handle streams.

> On May 4, 2016, at 10:07 AM, Toivo Adams <[email protected]> wrote:
>
> Julian,
>
> Thank you for your suggestions.
>
> I don't want to monitor a file and read appended records.
> Initially I want to read from an in-memory stream.
> Such a stream can be very, very large and doesn't fit in memory.
>
> My idea is to create a NiFi processor which uses SQL for data manipulation.
> https://nifi.apache.org/
>
> NiFi already contains a large set of processors which filter, split,
> route, etc. different data. Data can be CSV, JSON, Avro, whatever.
>
> Different processors use different parameters for how data should be
> filtered, split, routed, etc. I think it would be nice to be able to use
> a SQL statement to specify how data should be filtered, split, etc.
> Because NiFi is able to use very big data sets (called FlowFiles in NiFi),
> streaming is a must.
>
> I created a very simple POC of how to use a Stream instead of a File.
> I just created new modified versions of:
> src/main/java/org/apache/calcite/adapter/csv/CsvEnumerator2.java
> src/main/java/org/apache/calcite/adapter/csv/CsvSchema2.java
> src/main/java/org/apache/calcite/adapter/csv/CsvSchemaFactory2.java
> src/main/java/org/apache/calcite/adapter/csv/CsvTableScan2.java
> src/main/java/org/apache/calcite/adapter/csv/CsvTranslatableTable2.java
>
> CALCITE-1227 describes a slightly different use case.
>
> I am ready to contribute back, but my Calcite knowledge is very limited,
> so the current POC is more like a hack and not good code.
> Should I upload my current POC files to CALCITE-1227,
> or is it better to create another issue?
>
> Thanks
> toivo
>
>
> 2016-05-04 19:02 GMT+03:00 Julian Hyde <[email protected]>:
>
>> I've logged https://issues.apache.org/jira/browse/CALCITE-1227.
>> Feel free to start implementing it!
>>
>>> On May 4, 2016, at 8:56 AM, Julian Hyde <[email protected]> wrote:
>>>
>>> It's not straightforward to re-use a table adapter as a stream adapter.
>>> The reason is that one query might want to see the past (the current
>>> contents of the table) and another query might want to see the future
>>> (the stream of records added from this point on).
>>>
>>> I'm guessing that you want something like the CSV adapter that watches
>>> a file and reports records added to the end of the file (like the tail
>>> command [1]).
>>>
>>> You'd have to change CsvTable to implement StreamableTable, and
>>> implement the 'Table stream()' method to return a variant of the table
>>> that is in "follow" mode.
>>>
>>> It would probably be implemented by a variant of CsvEnumerator, but one
>>> that gets its input in bursts, as the file is appended to.
>>>
>>> Hope that helps.
>>>
>>> Julian
>>>
>>> [1] https://en.wikipedia.org/wiki/Tail_(Unix)
>>>
>>>> On May 2, 2016, at 3:15 AM, Toivo Adams <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> One possibility is to modify CsvEnumerator.
>>>> Opinions?
>>>>
>>>> Thanks
>>>> Toivo
>>>>
>>>> 2016-05-01 18:35 GMT+03:00 Toivo Adams <[email protected]>:
>>>>
>>>>> Hi,
>>>>>
>>>>> Please help a newbie.
>>>>> CSV works well for reading files, but I want to read data from a
>>>>> stream. The data is not fixed length and may be an endless stream.
>>>>>
>>>>> Any ideas how to accomplish this?
>>>>> Should I try to modify CsvTranslatableTable?
>>>>> Or should I take the Cassandra adapter as an example?
>>>>>
>>>>> Initially the data will be CSV, but later Avro is also a good
>>>>> candidate.
>>>>>
>>>>> Thanks
>>>>> Toivo
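[Editor's note] Toivo's POC classes (CsvEnumerator2 and friends) are not shown in this thread, but the core idea under discussion, pulling CSV rows lazily from a Reader instead of loading a File, can be sketched roughly as follows. This is a minimal illustration only: CsvStreamEnumerator is a hypothetical name, not part of the Calcite code base, and the comma split ignores quoting rules that a real CSV parser must handle.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Minimal sketch of a streaming CSV row source. Rows are pulled lazily
 * from a Reader one line at a time, so the whole input never needs to
 * fit in memory -- the property Toivo needs for endless NiFi streams.
 */
public class CsvStreamEnumerator implements Iterator<String[]> {
  private final BufferedReader reader;
  private String nextLine;   // look-ahead buffer; null means end of stream

  public CsvStreamEnumerator(Reader in) {
    this.reader = new BufferedReader(in);
    advance();
  }

  /** Reads the next line, or records end-of-stream. */
  private void advance() {
    try {
      nextLine = reader.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override public boolean hasNext() {
    return nextLine != null;
  }

  @Override public String[] next() {
    if (nextLine == null) {
      throw new NoSuchElementException();
    }
    // Naive split; a real implementation needs proper CSV quoting rules.
    String[] row = nextLine.split(",", -1);
    advance();
    return row;
  }
}
```

A Calcite enumerator built this way (wrapped in a linq4j Enumerator rather than a plain Iterator) could back a table whose input is any Reader or InputStream; the separate "follow mode" that Julian describes would additionally require implementing Calcite's StreamableTable interface so that its Table stream() method returns the streaming variant.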
