Re: TikaIO Refactoring

2017-10-05 Thread Eugene Kirpichov
On Thu, Oct 5, 2017 at 10:15 AM Sergey Beryozkin wrote:
> Hi Eugene
>
> I've done an initial commit to do with removing TikaSource, more work is
> needed and I see 3 tasks remaining:
> 1) provide a shortcut which can let users avoid using FileIO directly,
> as you suggested

Re: spark-submit forces jackson 2.4.4

2017-10-05 Thread Jacob Marble
Finally broke through the classpath problem by "relocating" Jackson. Nothing like stepping away from a problem for a couple of days. In case anyone cares, here are some details: https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html
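For readers following the link above, a relocation of this kind might look roughly like the sketch below. It is not taken from the original message: the `shaded.` package prefix is an assumed name chosen for illustration, and the plugin version is left out on purpose.

```xml
<!-- Sketch only: goes under <build><plugins> in pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Jackson's root package, rewritten inside the shaded jar -->
            <pattern>com.fasterxml.jackson</pattern>
            <!-- Assumed target prefix; any unused package name works -->
            <shadedPattern>shaded.com.fasterxml.jackson</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, the application's own Jackson classes are bundled under the renamed package, so they no longer clash with the Jackson version that spark-submit puts on the classpath.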

Re: Support for window analytic functions in SQL DSL

2017-10-05 Thread Mingmin Xu
@Kobi, currently we don't support window analytic functions; feel free to create a new-feature JIRA ticket.
On Thu, Oct 5, 2017 at 12:07 PM, Tyler Akidau wrote:
> I'm not aware of analytic window support. +Mingmin Xu
> or +James

Re: Support for window analytic functions in SQL DSL

2017-10-05 Thread Tyler Akidau
I'm not aware of analytic window support. +Mingmin Xu or +James could speak to any plans they might have regarding adding support.
-Tyler
On Mon, Oct 2, 2017 at 3:23 AM Kobi Salant wrote:
> Hi,
>
> Calcite streaming

Re: Using side inputs in any user code via thread-local side input accessor

2017-10-05 Thread Kenneth Knowles
Sorry for the delayed reply. This may be a non-issue, but my overarching comment was to address how (if at all) this relates to the portable model of a pipeline. One easy way to avoid violating this is to wait until https://github.com/apache/beam/pull/3938 is completed. This includes a

Re: TikaIO Refactoring

2017-10-05 Thread Sergey Beryozkin
Hi Eugene
I've done an initial commit to do with removing TikaSource, more work is needed and I see 3 tasks remaining:
1) provide a shortcut which can let users avoid using FileIO directly, as you suggested earlier, at the moment I do:

Re: TikaIO Refactoring

2017-10-05 Thread Sergey Beryozkin
Hi
Yes, using .withCompression(UNCOMPRESSED) works, but the test code looks funny:
p.apply("ParseFiles", FileIO.match().filepattern(resourcePath))
    .apply(FileIO.readMatches().withCompression(
        compressed ? Compression.UNCOMPRESSED : Compression.AUTO))

Re: TikaIO Refactoring

2017-10-05 Thread Sergey Beryozkin
Hi Eugene
On 04/10/17 22:52, Eugene Kirpichov wrote:
> You can avoid automatic decompression by using FileIO.readMatches().withCompression(UNCOMPRESSED) (default is AUTO).
It is nice that it was already anticipated that auto-decompression would not always be needed, and it would help a

CouchDbIO connector in beam io

2017-10-05 Thread tarush grover
Hi All, I would like input from community members on adding a CouchDB IO connector to our Beam IO module. Regards, Tarush