This will need to work with YARN. (Once Drill is YARN-enabled, I would
expect a lot of users to run it in conjunction with YARN.)
Paul, I am not clear on why this wouldn't work with YARN. Can you elaborate?

-Neeraja

On Mon, Jun 20, 2016 at 7:01 PM, Paul Rogers <prog...@maprtech.com> wrote:

> Good enough, as long as we document the limitation that this feature can’t
> work with a YARN deployment, since users generally do not have access to the
> temporary “localization” directories where YARN places the Drill code.
>
> Note that the jar distribution race condition occurs with the proposed
> design; I believe I sketched out a scenario in one of the earlier
> comments. Drillbit A receives the CREATE FUNCTION command and tells
> Drillbit B. While A is still informing the other Drillbits, Drillbit B plans
> and launches a query that uses the function. Drillbit Z starts executing the
> query before it learns from A about the new function. This will be rare:
> just rare enough to create very hard-to-reproduce bugs.
>
> The only reliable solution is to do the work in multiple passes:
>
> Pass 1: Ask each node to load the function, but not make it available to
> the planner. (It would be available to the execution engine.)
> Pass 2: Await confirmation from each node that this is done.
> Pass 3: Alert every node that it is now free to plan queries with the
> function.
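
A minimal sketch of the three passes described above, simulated in a single process. The `Node` class, its methods, and `register_function` are hypothetical names for illustration only; they are not Drill's API, and a real implementation would exchange these messages over the network and handle timeouts.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a Drillbit (not Drill's actual API)."""
    name: str
    fail_load: bool = False                       # simulate a node that cannot load the jar
    loaded: set = field(default_factory=set)      # functions the execution engine can run
    plannable: set = field(default_factory=set)   # functions the planner may use

    def load(self, fn: str) -> bool:
        # Pass 1 (node side): load for execution only, then acknowledge.
        if self.fail_load:
            return False
        self.loaded.add(fn)
        return True

    def publish(self, fn: str) -> None:
        # Pass 3 (node side): expose to the planner; must already be loaded.
        if fn not in self.loaded:
            raise RuntimeError(f"{self.name}: {fn} published before load")
        self.plannable.add(fn)

    def can_execute(self, fn: str) -> bool:
        return fn in self.loaded

    def can_plan(self, fn: str) -> bool:
        return fn in self.plannable


def register_function(nodes, fn: str) -> bool:
    # Pass 1: ask each node to load the function (execution engine only).
    acks = [node.load(fn) for node in nodes]
    # Pass 2: await confirmation from every node before going further.
    if not all(acks):
        return False  # no node has exposed the function to its planner
    # Pass 3: tell every node it may now plan queries with the function.
    for node in nodes:
        node.publish(fn)
    return True
```

The property this ordering is meant to give: by the time any node can plan a query with the function, every node can already execute it, so the Drillbit Z scenario cannot arise. Cleaning up nodes that loaded the function when another node failed is left out of the sketch.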
>
> Finally, I wonder if we should design the SQL syntax based on a long-term
> design, even if the feature itself is a short-term work-around. Changing
> the syntax later might break scripts that users might write.
>
> So, the question for the group is this: is the value of a semi-complete
> feature sufficient to justify the potential problems?
>
> - Paul
>
> > On Jun 20, 2016, at 6:15 PM, Parth Chandra <pchan...@maprtech.com> wrote:
> >
> > Moving discussion to dev.
> >
> > I believe the aim is to do a simple implementation without the complexity
> > of distributing the UDF. I think the document should make this limitation
> > clear.
> >
> > Per Paul's point that there is a simpler solution of just having each
> > drillbit detect whether a UDF is present, I think the problem is what
> > happens if a UDF gets deployed to some but not all drillbits. A query can
> > then start executing but not run successfully. The intent of the create
> > commands would be to ensure that either all drillbits have the UDF or none
> > do.
> >
> > I think Jacques' point about ownership conflicts is not addressed clearly.
> > Also, the unloading is not clear. The delete command should probably remove
> > the UDF and unload it.
> >
> >
> > On Fri, Jun 17, 2016 at 11:19 AM, Paul Rogers <prog...@maprtech.com> wrote:
> >
> >> Reviewed the spec; many comments posted. Three primary comments for the
> >> community to consider.
> >>
> >> 1. The design conflicts with the Drill-on-YARN project. Is this a specific
> >> fix for one unique problem, or is it worth expanding the solution to work
> >> with Drill-on-YARN deployments? It might be hard to make the two work
> >> together later. See comments in the docs for details.
> >>
> >> 2. Have we, by chance, looked at how other projects handle code
> >> distribution? Spark, Storm and others automatically deploy code across the
> >> cluster; no manual distribution to each node. The key difference between
> >> Drill and others is that, for Storm, say, code is associated with a job
> >> (“topology” in Storm terms). But, in Drill, functions are global and have
> >> no obvious life cycle that suggests when the code can be unloaded.
> >>
> >> 3. Have we considered the class loader, dependency and name space
> >> isolation issues addressed by such products as Tomcat (web apps) or
> >> Eclipse (plugins)? Putting user code in the same namespace as Drill code
> >> is quick & dirty. It turns out, however, that doing so leads to problems
> >> that require long, frustrating debugging sessions to resolve.
> >>
> >> Addressing item 1 might expand scope a bit. Addressing items 2 and 3 would
> >> be a big increase in scope, so I won’t be surprised if we leave those
> >> issues for later. (Though, addressing item 2 might be the best way to
> >> address item 1.)
> >>
> >> If we want a very simple solution that requires minimal change, perhaps we
> >> can use an even simpler solution. In the proposed design, the user still
> >> must distribute code to all the nodes. The primary change is to tell Drill
> >> to load (or unload) that code. Can we accomplish the same result more
> >> easily by simply having Drill periodically scan certain directories,
> >> looking for new (or removed) jars? This still won’t work with YARN or
> >> solve the name space issues, but it will work for existing non-YARN Drill
> >> users without new SQL syntax.
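
A rough sketch of the periodic directory scan suggested above, assuming jars are simply dropped into one directory per node. The function names, the 30-second default, and the callbacks are illustrative assumptions, not anything Drill provides.

```python
import time
from pathlib import Path


def diff_jars(directory, known):
    """Compare the jars now in `directory` with the previously `known` names.

    Returns (current, added, removed) as sets of file names.
    """
    current = {p.name for p in Path(directory).glob("*.jar")}
    return current, current - known, known - current


def watch(directory, on_add, on_remove, interval_seconds=30.0, max_polls=None):
    """Poll `directory`, calling on_add/on_remove for each jar that appears or vanishes.

    `max_polls` exists only so the loop can be bounded (e.g. in a test);
    None means poll forever.
    """
    known = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        known, added, removed = diff_jars(directory, known)
        for name in sorted(added):
            on_add(name)       # hypothetical hook: load the functions in this jar
        for name in sorted(removed):
            on_remove(name)    # hypothetical hook: unload them
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval_seconds)
```

Because each node scans on its own schedule, this approach gives no all-or-none guarantee across the cluster, which is the concern raised elsewhere in this thread. A real scanner would also need to avoid picking up a jar that is still being copied.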
> >>
> >> Thanks,
> >>
> >> - Paul
> >>
> >>> On Jun 16, 2016, at 2:07 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
> >>>
> >>> Two quick thoughts:
> >>>
> >>> - (user) In the design document I didn't see any discussion of
> >>> ownership/conflicts or unloading. It would be helpful to see the
> >>> thinking there.
> >>> - (dev) There is a row-oriented facade via the
> >>> FieldReader/FieldWriter/ComplexWriter classes. That would be a good place
> >>> to start when trying to implement an alternative interface.
> >>>
> >>>
> >>> --
> >>> Jacques Nadeau
> >>> CTO and Co-Founder, Dremio
> >>>
> >>> On Thu, Jun 16, 2016 at 11:32 AM, John Omernik <j...@omernik.com> wrote:
> >>>
> >>>> Honestly, I don't see it as a priority issue. I think some of the ideas
> >>>> around community Java UDFs could be a better approach. I'd hate to take
> >>>> away from other work to hack in something like this.
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Jun 16, 2016 at 1:19 PM, Paul Rogers <prog...@maprtech.com> wrote:
> >>>>
> >>>>> Ted refers to source code transformation. Drill gains its speed from
> >>>>> value vectors (VVs). However, VVs are a far cry from the row-based
> >>>>> interface that most mere mortals are accustomed to using. Since VVs are
> >>>>> very type-specific, code is typically generated to handle the specifics
> >>>>> of each type. Accessing VVs in Jython may be a bit of a challenge
> >>>>> because of the "impedance mismatch" between how VVs work and the
> >>>>> row-and-column view expected by most (non-Drill) developers.
> >>>>>
> >>>>> I wonder if we've considered providing a row-oriented "facade" that can
> >>>>> be used by roll-your-own data sources and user-defined row transforms?
> >>>>> It might be a hiccup in the fast VV pipeline, but might be handy for
> >>>>> users willing to trade a bit of speed for convenience. With such a
> >>>>> facade, the Jython row transforms that John mentions could be quite
> >>>>> simple.
> >>>>>
> >>>>> On Thu, Jun 16, 2016 at 10:36 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >>>>>
> >>>>>> Since UDFs use source code transformation, using Jython would be
> >>>>>> difficult.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jun 16, 2016 at 9:42 AM, Arina Yelchiyeva <
> >>>>>> arina.yelchiy...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Charles,
> >>>>>>>
> >>>>>>> Not that I am aware of. The proposed solution doesn't invent anything
> >>>>>>> new; it just adds the ability to add UDFs without a drillbit restart.
> >>>>>>> But contributions are welcome.
> >>>>>>>
> >>>>>>> On Thu, Jun 16, 2016 at 4:52 PM Charles Givre <cgi...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Arina,
> >>>>>>>> Has there been any discussion about making it possible, via Jython
> >>>>>>>> or something similar, for users to write simple UDFs in Python?
> >>>>>>>> My ideal would be to have this capability integrated into the web GUI
> >>>>>>>> such that a user could write their UDF (in Python) right there and
> >>>>>>>> submit it, and it would be deployed to Drill if it passes validation
> >>>>>>>> tests.
> >>>>>>>> —C
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Jun 16, 2016, at 09:34, Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi all!
> >>>>>>>>>
> >>>>>>>>> I have created a Jira to add support for dynamic UDFs in Drill
> >>>>>>>>> (https://issues.apache.org/jira/browse/DRILL-4726). There is a link
> >>>>>>>>> to the design document in the Jira description.
> >>>>>>>>> Comments and suggestions are welcome.
> >>>>>>>>>
> >>>>>>>>> Kind regards
> >>>>>>>>> Arina
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> >>
>
>
