Re: [DISCUSS] Update Roadmap

Guilherme Silveira Mon, 29 Feb 2016 10:38:28 -0800

I agree. Job scheduling should be a core feature.
On Feb 29, 2016, at 12:15, "Benjamin Kim" <bbuil...@gmail.com> wrote:


> I concur with this suggestion. In the enterprise, management would like
> scheduled runs to be tracked, monitored, and given SLA constraints for
> mission-critical jobs. Alerts and notifications, with clear error
> details, are crucial for DevOps to respond. If Zeppelin notebooks can
> be executed by a third-party scheduling application, such as Oozie, then
> this requirement can be satisfied even if there are no immediate plans
> for a built-in one.
>
> On Feb 29, 2016, at 1:17 AM, Eran Witkon <eranwit...@gmail.com> wrote:
>
> @Vinayak Agrawal I would suggest adding the ability to connect Zeppelin
> to existing scheduling/workflow tools such as
> https://oozie.apache.org/. This requires better hooks and status
> reporting but doesn't make Zeppelin an ETL/scheduler tool by itself.
>
>
> On Mon, Feb 29, 2016 at 10:21 AM Vinayak Agrawal <
> vinayakagrawa...@gmail.com> wrote:
>
>> Moon,
>> The new roadmap looks very promising. I am very happy to see security in
>> the list.
>> I have some suggestions regarding Enterprise Ready features:
>>
>> 1. Job Scheduler - Can this be improved?
>> Currently the scheduler can be used with a Cron expression or a pre-set
>> time. But in an enterprise solution, a notebook might be one piece of a
>> workflow. Can we look at scheduling notebooks
>> based on other notebooks finishing their jobs successfully?
>> This requirement would arise in any ETL workflow, where all the
>> downstream users wait for the ETL notebook to finish successfully. Only
>> after that can other business-oriented notebooks be executed.
>>
>> 2. Importing a notebook - Is there a current requirement or future plan
>> to implement a feature that allows import-notebook-from-github? This would
>> allow users to share notebooks seamlessly.
>>
>> Thanks
>> Vinayak
>>
>> On Sun, Feb 28, 2016 at 11:22 PM, moon soo Lee <m...@apache.org> wrote:
>>
>>> Zhong Wang,
>>> Right, Folder support would be quite useful. Thanks for the opinion.
>>>
>>> Hope I can finish the work on pr-190
>>> <https://github.com/apache/incubator-zeppelin/pull/190>.
>>>
>>
>>> Sourav,
>>> Regarding concurrent running, Zeppelin does not limit running
>>> paragraphs/queries concurrently. Each interpreter can implement its own
>>> scheduling policy. For example, the SparkSQL interpreter and
>>> ShellInterpreter can already run paragraphs/queries concurrently.
>>>
>>> SparkInterpreter is implemented with a FIFO scheduler because of the
>>> nature of the Scala compiler. That's why users cannot run multiple
>>> paragraphs concurrently when they work with SparkInterpreter.
>>> But as Zhong Wang mentioned, pr-703 gives each notebook a separate
>>> Scala compiler, so paragraphs can run concurrently as long as they're in
>>> different notebooks.
>>> Thanks for the feedback!
>>>
>>> Best,
>>> moon
>>>
>>> On Sat, Feb 27, 2016 at 8:59 PM Zhong Wang <wangzhong....@gmail.com>
>>> wrote:
>>>
>>>> Sourav: I think this newly merged PR can help you
>>>> https://github.com/apache/incubator-zeppelin/pull/703#issuecomment-185582537
>>>>
>>>> On Sat, Feb 27, 2016 at 1:46 PM, Sourav Mazumder <
>>>> sourav.mazumde...@gmail.com> wrote:
>>>>
>>>>> Hi Moon,
>>>>>
>>>>> This looks great.
>>>>>
>>>>> My only suggestion would be to include a PR/feature - Support for
>>>>> Running Concurrent paragraphs/queries in Zeppelin.
>>>>>
>>>>> Right now if more than one user tries to run paragraphs in multiple
>>>>> notebooks concurrently through a single Zeppelin instance (and single
>>>>> interpreter instance) the performance is very slow. It is obvious that the
>>>>> queue gets built up within the zeppelin process and interpreter process in
>>>>> that scenario as the time taken to move the status from start to pending
>>>>> and pending to running is very high compared to the actual running time of
>>>>> a paragraph.
>>>>>
>>>>> Without this, multi-tenancy support would be meaningless, as no one
>>>>> can practically use it in a situation where multiple users are trying to
>>>>> connect to the same instance of Zeppelin (and the related interpreter). A
>>>>> possible solution would be to spawn a separate instance of the same
>>>>> interpreter at every notebook/user level.
>>>>>
>>>>> Regards,
>>>>> Sourav
>>>>>
>>>>> On Sat, Feb 27, 2016 at 12:48 PM, moon soo Lee <m...@apache.org> wrote:
>>>>>
>>>>>> Hi Zeppelin users and developers,
>>>>>>
>>>>>> The roadmap we have published at
>>>>>> https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap
>>>>>> is almost 9 months old, and it no longer reflects where the community
>>>>>> is going. It's time to update it.
>>>>>>
>>>>>> Based on the mailing list, JIRA issues, pull requests, feedback from
>>>>>> users, conferences, and meetings, I could summarize the major interests of
>>>>>> users and developers in 7 categories: Enterprise ready, Usability
>>>>>> improvement, Pluggability, Documentation, Backend integration, Notebook
>>>>>> storage, and Visualization.
>>>>>>
>>>>>> And I could list related subjects under each category.
>>>>>>
>>>>>
>>>>>>    - Enterprise ready
>>>>>>       - Authentication
>>>>>>          - Shiro authentication ZEPPELIN-548
>>>>>>          <https://issues.apache.org/jira/browse/ZEPPELIN-548>
>>>>>>       - Authorization
>>>>>>          - Notebook authorization PR-681
>>>>>>          <https://github.com/apache/incubator-zeppelin/pull/681>
>>>>>>       - Security
>>>>>>       - Multi-tenancy
>>>>>>       - Stability
>>>>>>    - Usability Improvement
>>>>>>       - UX improvement
>>>>>>       - Better Table data support
>>>>>>          - Download data as csv, etc PR-725
>>>>>>          <https://github.com/apache/incubator-zeppelin/pull/725>,
>>>>>>          PR-714
>>>>>>          <https://github.com/apache/incubator-zeppelin/pull/714>,
>>>>>>          PR-6 <https://github.com/apache/incubator-zeppelin/pull/6>,
>>>>>>          PR-89 <https://github.com/apache/incubator-zeppelin/pull/89>
>>>>>>          - Featureful table data display (pagination, etc)
>>>>>>    - Pluggability ZEPPELIN-533
>>>>>>    <https://issues.apache.org/jira/browse/ZEPPELIN-533>
>>>>>>       - Pluggable visualization
>>>>>>       - Dynamic Interpreter, notebook, visualization loading
>>>>>>       - Repository and registry for pluggable components
>>>>>>    - Improve documentation
>>>>>>       - Improve contents and readability
>>>>>>       - more tutorials, examples
>>>>>>    - Interpreter
>>>>>>       - Generic JDBC Interpreter
>>>>>>       - (spark)R Interpreter
>>>>>>       - Cluster manager for interpreter (Proposal
>>>>>>       <https://cwiki.apache.org/confluence/display/ZEPPELIN/Cluster+Manager+Proposal>
>>>>>>       )
>>>>>>       - more interpreters
>>>>>>    - Notebook storage
>>>>>>       - Versioning ZEPPELIN-540
>>>>>>       <http://issues.apache.org/jira/browse/ZEPPELIN-540>
>>>>>>       - more notebook storages
>>>>>>    - Visualization
>>>>>>       - More visualizations PR-152
>>>>>>       <https://github.com/apache/incubator-zeppelin/pull/152>, PR-728
>>>>>>       <https://github.com/apache/incubator-zeppelin/pull/728>, PR-336
>>>>>>       <https://github.com/apache/incubator-zeppelin/pull/336>, PR-321
>>>>>>       <https://github.com/apache/incubator-zeppelin/pull/321>
>>>>>>       - Customize graph (show/hide label, color, etc)
>>>>>>
>>>>>>
>>>>>> It will help anyone quickly grasp the overall interests and direction
>>>>>> of the project. And based on this roadmap, we can discuss and redefine
>>>>>> the scope of the next release, 0.6.0, and its schedule.
>>>>>>
>>>>>> What do you think? Any feedback would be appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>> moon
>>>>>>
>>>>>>
>>
>>
>> --
>> Vinayak Agrawal
>>
>>
>> "To Strive, To Seek, To Find and Not to Yield!"
>> ~Lord Alfred Tennyson
>>
>
>
