Re: ElasticSearch and dynamic indexes

2017-11-17 Thread NerdyNick
Jira for this is https://issues.apache.org/jira/browse/BEAM-3222

I'm also looking at https://github.com/json-path/JsonPath to handle
fetching the index value out of the JSON.

On Fri, Nov 17, 2017 at 11:43 AM, NerdyNick <nerdyn...@gmail.com> wrote:

> I'm looking to expand the ElasticsearchIO writer to support dynamic
> indexes and wanted to kick off a discussion about the design. I'll create
> a Jira for it once I get going.
>
> Right now I'm thinking of expanding the ConnectionConfiguration class to
> accept one of three things: an index name (the current behavior), a
> dot-path string used to look into the JSON document and pull out the index
> value, or a user-provided function that takes the document and returns an
> index. The Write class would change to use a Multimap keyed by index. The
> Multimap as a whole would count toward the batch size, rather than each
> index separately. I opt for that since I expect the index to usually stay
> the same until a time window ends.
>
> I'm aiming for the simplest change in design here, so as not to impact the
> work Chet is doing on BEAM-3201.
>
> --
> Nick Verbeck - NerdyNick
> 
> NerdyNick.com
> TrailsOffroad.com
> NoKnownBoundaries.com
>
>
>


-- 
Nick Verbeck - NerdyNick

NerdyNick.com
TrailsOffroad.com
NoKnownBoundaries.com


ElasticSearch and dynamic indexes

2017-11-17 Thread NerdyNick
I'm looking to expand the ElasticsearchIO writer to support dynamic
indexes and wanted to kick off a discussion about the design. I'll create
a Jira for it once I get going.

Right now I'm thinking of expanding the ConnectionConfiguration class to
accept one of three things: an index name (the current behavior), a
dot-path string used to look into the JSON document and pull out the index
value, or a user-provided function that takes the document and returns an
index. The Write class would change to use a Multimap keyed by index. The
Multimap as a whole would count toward the batch size, rather than each
index separately. I opt for that since I expect the index to usually stay
the same until a time window ends.
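
A minimal sketch of the three index options and the Multimap-style batching
described above, for discussion only. It assumes documents are already parsed
into nested java.util.Map values rather than raw JSON strings, and every name
in it (DynamicIndexSketch, fixedIndex, dotPathIndex, Batcher, maxBatchSize) is
hypothetical and not part of ElasticsearchIO. A user-provided function would
simply be any Function<Map<String, Object>, String>.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; none of these names exist in ElasticsearchIO.
final class DynamicIndexSketch {

  // Option 1: a fixed index name (the current behavior).
  static Function<Map<String, Object>, String> fixedIndex(String index) {
    return doc -> index;
  }

  // Option 2: a dot-path such as "meta.index", walked through nested maps.
  static Function<Map<String, Object>, String> dotPathIndex(String path) {
    return doc -> {
      Object current = doc;
      for (String key : path.split("\\.")) {
        if (!(current instanceof Map)) {
          throw new IllegalArgumentException("No value at path: " + path);
        }
        current = ((Map<?, ?>) current).get(key);
      }
      if (current == null) {
        throw new IllegalArgumentException("No value at path: " + path);
      }
      return current.toString();
    };
  }

  // Groups documents by index; the batch size counts documents across
  // all indexes, not per index.
  static final class Batcher {
    private final Function<Map<String, Object>, String> indexFn;
    private final int maxBatchSize;
    private final Map<String, List<Map<String, Object>>> pending = new LinkedHashMap<>();
    private int pendingCount = 0;

    // Stands in for the bulk requests a real writer would send.
    final List<Map<String, List<Map<String, Object>>>> flushed = new ArrayList<>();

    Batcher(Function<Map<String, Object>, String> indexFn, int maxBatchSize) {
      this.indexFn = indexFn;
      this.maxBatchSize = maxBatchSize;
    }

    void add(Map<String, Object> doc) {
      pending.computeIfAbsent(indexFn.apply(doc), k -> new ArrayList<>()).add(doc);
      pendingCount++;
      if (pendingCount >= maxBatchSize) {
        flush();
      }
    }

    void flush() {
      if (pendingCount == 0) {
        return;
      }
      // Copy the map so clearing the pending state keeps the flushed batch intact.
      flushed.add(new LinkedHashMap<>(pending));
      pending.clear();
      pendingCount = 0;
    }
  }
}
```

With maxBatchSize set to 2, adding one document for each of two different
indexes triggers a single flush that contains both index keys, which is the
"whole Multimap counts toward the batch size" behavior.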

I'm aiming for the simplest change in design here, so as not to impact the
work Chet is doing on BEAM-3201.

-- 
Nick Verbeck - NerdyNick

NerdyNick.com
TrailsOffroad.com
NoKnownBoundaries.com


Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread NerdyNick
hare and diverge otherwise (which
> >> >> seems it would be much simpler both to implement and explain to users). I
> >> >> would be OK with even letting the Spark 1.x runner be somewhat
> >> >> stagnant (e.g. few or no new features) until we decide we can kill it
> >> >> off.
> >> >>
> >> >> On Tue, Nov 7, 2017 at 11:27 PM, Jean-Baptiste Onofré
> >> >> <j...@nanthrax.net> wrote:
> >> >>
> >> >>> Hi all,
> >> >>>
> >> >>> as you might know, we are working on Spark 2.x support in the Spark
> >> >>> runner.
> >> >>>
> >> >>> I'm working on a PR about that:
> >> >>>
> >> >>> https://github.com/apache/beam/pull/3808
> >> >>>
> >> >>> Today, we have something working with both Spark 1.x and 2.x from a
> >> >>> code standpoint, but I have to deal with dependencies. It's the first
> >> >>> step of the update as I'm still using RDD, the second step would be to
> >> >>> support dataframe (but for that, I would need PCollection elements with
> >> >>> schemas, that's another topic on which Eugene, Reuven and I are
> >> >>> discussing).
> >> >>>
> >> >>> However, as all major distributions now ship Spark 2.x, I don't think
> >> >>> it's required anymore to support Spark 1.x.
> >> >>>
> >> >>> If we agree, I will update and cleanup the PR to only support and focus
> >> >>> on Spark 2.x.
> >> >>>
> >> >>> So, that's why I'm calling for a vote:
> >> >>>
> >> >>>   [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
> >> >>>   [ ] 0 (I don't care ;))
> >> >>>   [ ] -1, I would like to still support Spark 1.x, and so having
> >> >>>       support of both Spark 1.x and 2.x (please provide specific comment)
> >> >>>
> >> >>> This vote is open for 48 hours (I have the commits ready, just waiting
> >> >>> the end of the vote to push on the PR).
> >> >>>
> >> >>> Thanks !
> >> >>> Regards
> >> >>> JB
> >> >>> --
> >> >>> Jean-Baptiste Onofré
> >> >>> jbono...@apache.org
> >> >>> http://blog.nanthrax.net
> >> >>> Talend - http://www.talend.com
> >> >>>
> >> >>
> >> > --
> >> > Jean-Baptiste Onofré
> >> > jbono...@apache.org
> >> > http://blog.nanthrax.net
> >> > Talend - http://www.talend.com
> >> >
> >>
> >
> >
> >
> > --
> > Twitter: https://twitter.com/holdenkarau
> >
>
>
>
> --
> Twitter: https://twitter.com/holdenkarau
>



-- 
Nick Verbeck - NerdyNick

NerdyNick.com
TrailsOffroad.com
NoKnownBoundaries.com