There is a discussion thread for Release Policy. https://s.apache.org/3JCm Please check this thread, too.

Thanks,
moon

On Thu, Mar 24, 2016 at 12:02 PM Guilherme Silveira <guilhermecgss...@gmail.com> wrote:

> Is there a predefined release interval, let's say 6 months or 1 year,
> between one version and another?
>
> On Mar 23, 2016 4:10 PM, "Joel Van Veluwen" <joel.vanvelu...@quantium.com.au> wrote:
>
>> Hi Nikolay,
>>
>> I raised this with MapR and there don't appear to be plans to add
>> Zeppelin to 5.1.
>>
>> https://community.mapr.com/message/40332
>>
>> We are deploying it manually and everything is pretty stable, but it
>> will vary depending on your environment.
>>
>> Cheers,
>>
>> Joel Van Veluwen
>> *QUANTIUM*
>> Level 25, 8 Chifley
>> 8-12 Chifley Square
>> Sydney NSW 2000
>>
>> T: +61 2 8224 8981
>> M: +61 403 153 265
>> F: +61 2 9292 6444
>>
>> W: quantium.com.au <http://www.quantium.com.au>
>>
>> *From:* Nikolay Voronchikhin [mailto:nvoronchik...@gmail.com]
>> *Sent:* Tuesday, 22 March 2016 11:39 AM
>> *To:* users@zeppelin.incubator.apache.org
>> *Subject:* Re: [DISCUSS] Update Roadmap
>>
>> Hi Zeppelin Users and Developers,
>>
>> Do you know if MapR will be adding Zeppelin to its roadmap for the next
>> version after MapR 5.1?
>>
>> We see in Hue 3.9 that it provides notebooks for R Shell, Python Shell,
>> PySpark, SparkR, Hive SQL, Impala SQL, and Spark SQL, but no Drill SQL
>> notebook.
>>
>> We are looking for an Apache Project that focuses on a Drill Notebook UI
>> that performs better than the Drill Web Console UI itself.
>>
>> Sincerely,
>>
>> *Nikolay Voronchikhin*
>> *Big Data/Data Warehouse/Data Science/Data Platforms Engineer at Cisco*
>> *https://www.linkedin.com/in/nvoronchikhin*
>> *E-mail: nvoronchik...@gmail.com*
>> *Mobile: 951-288-2778*
>>
>> On Mon, Mar 21, 2016 at 2:44 PM, rohit choudhary <rconl...@gmail.com> wrote:
>>
>> Dear All,
>>
>> I think direction setting is important for Enterprise readiness. I have a
>> little bit of an overview of Ambari Views, which is very similar in nature
>> to Zeppelin. Please let me explain:
>>
>> Hive View - interacts with Hive
>> Pig View - interacts with Pig
>> Workflow Designer - interacts with Oozie
>>
>> We have a very similar architecture in Zeppelin where we interact with
>> these systems through Interpreters. The usage will also be similar, as both
>> will interact with Hadoop clusters or in some cases Spark with Yarn on
>> HDFS. Our priorities should include:
>>
>> - Design & implement for multi-tenancy
>> - Auditability from Data/State and Lineage perspective
>> - Ability to share Notebooks/Data/State across users, preferably through
>>   SparkContext sharing
>> - Security between Zeppelin and the other systems, not limited to Spark
>>   through Kerberos. (@Rick +1)
>>
>> I will share an initial draft of the thoughts I have in mind, in the next
>> couple of days.
>>
>> Thanks,
>> Rohit.
>>
>> On Thu, Mar 3, 2016 at 7:54 AM, moon soo Lee <m...@apache.org> wrote:
>>
>> Shabeel, thanks for the feedback about rest api and custom id.
>> That might help avoid multiple rest api calls.
>>
>> Thanks everyone for valuable feedback. Looks like we're all going in the
>> same direction. I have updated the wiki.
>>
>> https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap
>>
>> Please take a look.
>>
>> I'm sure there are many missing details in this roadmap. I must say that
>> something not being on this roadmap doesn't mean the community is not
>> working on it or that it can't be included in Zeppelin. The roadmap
>> represents more like community interest and overall direction.
>>
>> We're not changing the roadmap every day, but that doesn't mean the
>> roadmap is set in stone and can never be changed. We can improve it
>> continuously.
>>
>> Please feel free to fork this mail thread for any further discussion
>> on a specific subject (e.g. job scheduling).
>>
>> Thanks,
>> moon
>>
>> On Wed, Mar 2, 2016 at 12:31 AM Shabeel Syed <shabeels...@gmail.com> wrote:
>>
>> Also we need better rest api support for creating and fetching the
>> notebooks and paragraphs.
>>
>> For example, if I can set a custom-defined notebookid and paragraphid, we
>> can avoid multiple rest api calls.
>>
>> http://localhost:8080/#/notebook/<notebookid>/paragraph/<paragraphid>?asIframe
>>
>> should return an error if the notebook or paragraph does not exist.
>>
>> And while creating a notebook or paragraph I should be able to specify my
>> custom ids.
>>
>> Regards
>> Shabeel
>>
>> On Wed, Mar 2, 2016 at 11:55 AM, Zhong Wang <wangzhong....@gmail.com> wrote:
>>
>> +1 on @rick. quality is really important... I am still encountering bugs
>> consistently
>>
>> On Tue, Mar 1, 2016 at 10:16 AM, TEJA SRIVASTAV <tejasrivas...@gmail.com> wrote:
>>
>> +1 on @rick
>>
>> On Tue, Mar 1, 2016 at 11:26 PM Benjamin Kim <bbuil...@gmail.com> wrote:
>>
>> I see in the Enterprise section that multi-tenancy will be included. Will
>> this have user impersonation too?
>> In this way, the user executing will be the user owning the process.
>>
>> On Mar 1, 2016, at 12:51 AM, Shabeel Syed <shabeels...@gmail.com> wrote:
>>
>> +1
>>
>> Hi Tamas,
>>
>> Pluggable external visualization is really a GREAT feature to have.
>> I'm looking forward to this :)
>>
>> Regards
>> Shabeel
>>
>> On Tue, Mar 1, 2016 at 2:16 PM, Tamas Szuromi <tamas.szur...@odigeo.com> wrote:
>>
>> Hey,
>>
>> Really promising roadmap.
>>
>> I'd only push more visualization options. I agree built-in visualization
>> is needed with limited charting options, but I think we also need to
>> somehow 'inject' external js visualizations.
>>
>> For scheduling Zeppelin notebooks we use
>> https://github.com/airbnb/airflow
>> through the job rest api. It's an enterprise ready and very robust
>> solution right now.
>>
>> *Tamas*
>>
>> On 1 March 2016 at 09:12, Eran Witkon <eranwit...@gmail.com> wrote:
>>
>> One point to clarify: I don't want to suggest Oozie specifically. I want
>> to think about which features we develop and which ones we integrate from
>> external, preferably Apache, technology. We don't think about building our
>> own storage services, so why build our own scheduler?
>> Eran
>>
>> On Tue, 1 Mar 2016 at 09:49 moon soo Lee <m...@apache.org> wrote:
>>
>> @Vinayak, @Eran, @Benjamin, @Guilherme, @Sourav, @Rick
>>
>> Now I can see a lot of demand around enterprise level job scheduling.
>> Either external or built-in, I completely agree with having enterprise
>> level job scheduling support on the roadmap.
>>
>> ZEPPELIN-137 <https://issues.apache.org/jira/browse/ZEPPELIN-137> and
>> ZEPPELIN-531 <https://issues.apache.org/jira/browse/ZEPPELIN-531> are
>> related issues I can find in our JIRA.
>>
>> @Vinayak
>>
>> Regarding importing notebooks from github, Zeppelin has a pluggable
>> notebook storage layer (see related package
>> <https://github.com/apache/incubator-zeppelin/tree/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/repo>).
>> So, github notebook sync can be implemented easily.
>>
>> @Shabeel
>>
>> Right, we need better memory management to prevent such OOM.
>>
>> And I think a table is one of the most frequently used ways of displaying
>> data. So definitely, we'll need more features like filter, sort, etc.
>>
>> After this roadmap discussion, discussion for the next release will
>> follow. Then we'll get an idea of when those features will be available.
>>
>> @Prasad
>>
>> Thanks for mentioning HA and DR. They're really important subjects for
>> enterprise use. Definitely Zeppelin will need to address them.
>>
>> And displaying meta information of notebooks on the top level page is a
>> good idea.
>>
>> It's really great to hear many opinions and ideas.
>>
>> And thanks @Rick for sharing a valuable view of the Zeppelin project.
>>
>> Thanks,
>> moon
>>
>> On Mon, Feb 29, 2016 at 11:14 PM Rick Moritz <rah...@gmail.com> wrote:
>>
>> Hi,
>>
>> For one, I know that there is rudimentary scheduling built into Zeppelin
>> already (at least I fixed a bug in the test for a scheduling feature a few
>> months ago).
>>
>> But another point is that Zeppelin should also focus on quality,
>> reproducibility and portability.
>>
>> Although this doesn't offer exciting new features, it would make
>> development much easier.
>>
>> Cross-platform testability, tests that pass when run sequentially,
>> compatibility with Firefox, and many more open issues that make it so much
>> harder to enhance Zeppelin and add features should be addressed soon,
>> preferably before more features are added.
>> Already Zeppelin is suffering - in my opinion - from quite a lot of
>> feature creep, and we should avoid putting in the kitchen sink at the
>> cost of quality and maintainability.
>> Instead modularity (ZEPPELIN-533 in particular) should be targeted.
>>
>> Oozie, in my opinion, is a dead end - it may de facto still be in use on
>> many clusters, but it's not getting the love it needs, and I wouldn't bet
>> on it when it comes to integrating scheduling. Instead, any external tool
>> should be able to use the REST-API to trigger executions, if you want
>> external scheduling.
>>
>> So, in conclusion, if we take Moon's list as a list of descending
>> priorities, I fully agree, under the condition that code quality is
>> included as a subset of enterprise-readiness. Auth* is paramount (Kerberos
>> SPNEGO SSO support is what we really want) with user and group rights
>> assignment on the notebook level. We probably also need Knox-integration
>> (ODP-Members looking at integrating Zeppelin should consider contributing
>> this), and integration of something like Spree
>> (https://github.com/hammerlab/spree) to be able to profile jobs.
>>
>> I'm hopeful that soon I can resume contributing some quality-oriented
>> code, to drive this "necessary evil" forward ;)
>>
>> On Mon, Feb 29, 2016 at 8:27 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>
>> I do agree with Vinayak. It need not be coupled with Oozie.
>>
>> Rather one should be able to call it from any scheduler typically used at
>> the enterprise level. Maybe support for BPML.
>>
>> I believe the existing ability to call/execute a Zeppelin Notebook or a
>> specific paragraph within a notebook using the REST API should take care
>> of this requirement to some extent.
>>
>> Regards,
>> Sourav
>>
>> On Mon, Feb 29, 2016 at 11:23 AM, Vinayak Agrawal <vinayakagrawa...@gmail.com> wrote:
>>
>> @Eran Witkon,
>>
>> Thanks for the suggestion Eran. I concur with your thought.
>>
>> If Zeppelin can be integrated with Oozie, that would be wonderful. Users
>> will also be able to leverage their Oozie skills.
>>
>> This would be promising for now.
>>
>> However, in the future Hadoop might not necessarily be installed in a
>> Spark cluster, and Oozie (since it installs with the Hadoop distribution)
>> might not be available.
>>
>> So perhaps we should give some thought to this feature for the future.
>> Should it depend on Oozie or should Zeppelin have its own scheduling?
>>
>> As Benjamin has reiterated, Databricks notebook has this as a core
>> notebook feature.
>>
>> Also, would anybody give any suggestions regarding the "sync with github"
>> feature?
>>
>> - Exporting notebook to Github
>> - Importing notebook from Github
>>
>> Thanks
>> Vinayak
>>
>> On Mon, Feb 29, 2016 at 4:17 AM, Eran Witkon <eranwit...@gmail.com> wrote:
>>
>> @*Vinayak Agrawal* I would suggest adding the ability to connect
>> Zeppelin to existing scheduling tools\workflow tools such as
>> https://oozie.apache.org/. This requires better hooks and status
>> reporting but doesn't make Zeppelin an ETL\scheduler tool by itself.
>>
>> On Mon, Feb 29, 2016 at 10:21 AM Vinayak Agrawal <vinayakagrawa...@gmail.com> wrote:
>>
>> Moon,
>>
>> The new roadmap looks very promising. I am very happy to see security in
>> the list.
>> I have some suggestions regarding Enterprise Ready features:
>>
>> 1. Job Scheduler - Can this be improved?
>>
>> Currently the scheduler can be used with a Cron expression or a pre-set
>> time. But in an enterprise solution, a notebook might be one piece of the
>> workflow. Can we look towards the functionality of scheduling notebooks
>> based on other notebooks finishing their job successfully?
>>
>> This requirement would arise in any ETL workflow, where all the
>> downstream users wait for the ETL notebook to finish successfully. Only
>> after that can other business oriented notebooks be executed.
>>
>> 2.
>> Importing a notebook - Is there a current requirement or future plan
>> to implement a feature that allows import-notebook-from-github? This would
>> allow users to share notebooks seamlessly.
>>
>> Thanks
>> Vinayak
>>
>> On Sun, Feb 28, 2016 at 11:22 PM, moon soo Lee <m...@apache.org> wrote:
>>
>> Zhong Wang,
>>
>> Right, Folder support would be quite useful. Thanks for the opinion.
>>
>> Hope I can finish the work pr-190
>> <https://github.com/apache/incubator-zeppelin/pull/190>.
>>
>> Sourav,
>>
>> Regarding concurrent running, Zeppelin doesn't have a limitation on
>> running paragraphs/queries concurrently. An interpreter can implement its
>> own scheduling policy. For example, SparkSQL interpreter and
>> ShellInterpreter can already run paragraphs/queries concurrently.
>>
>> SparkInterpreter is implemented with a FIFO scheduler considering the
>> nature of the scala compiler. That's why users cannot run multiple
>> paragraphs concurrently when they work with SparkInterpreter.
>>
>> But as Zhong Wang mentioned, pr-703 enables each notebook to have a
>> separate scala compiler, so paragraphs run concurrently as long as
>> they're in different notebooks.
>>
>> Thanks for the feedback!
>>
>> Best,
>> moon
>>
>> On Sat, Feb 27, 2016 at 8:59 PM Zhong Wang <wangzhong....@gmail.com> wrote:
>>
>> Sourav: I think this newly merged PR can help you
>> https://github.com/apache/incubator-zeppelin/pull/703#issuecomment-185582537
>>
>> On Sat, Feb 27, 2016 at 1:46 PM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
>>
>> Hi Moon,
>>
>> This looks great.
>>
>> My only suggestion would be to include a PR/feature - Support for Running
>> Concurrent paragraphs/queries in Zeppelin.
>>
>> Right now if more than one user tries to run paragraphs in multiple
>> notebooks concurrently through a single Zeppelin instance (and single
>> interpreter instance) the performance is very slow.
>> It is obvious that the queue gets built up within the zeppelin process
>> and interpreter process in that scenario, as the time taken to move the
>> status from start to pending and pending to running is very high compared
>> to the actual running time of a paragraph.
>>
>> Without this the multi-tenancy support would be meaningless, as no one can
>> practically use it in a situation where multiple users are trying to
>> connect to the same instance of Zeppelin (and the related interpreter). A
>> possible solution would be to spawn a separate instance of the same
>> interpreter at every notebook/user level.
>>
>> Regards,
>> Sourav
>>
>> On Sat, Feb 27, 2016 at 12:48 PM, moon soo Lee <m...@apache.org> wrote:
>>
>> Hi Zeppelin users and developers,
>>
>> The roadmap we have published at
>> https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap
>> is almost 9 months old, and it doesn't reflect where the community is
>> going anymore. It's time to update.
>>
>> Based on the mailing list, jira issues, pull requests, feedback from
>> users, conferences and meetings, I could summarize the major interests of
>> users and developers in 7 categories: Enterprise ready, Usability
>> improvement, Pluggability, Documentation, Backend integration, Notebook
>> storage, and Visualization.
>>
>> And I could list related subjects under each category.
>>
>> - Enterprise ready
>>   - Authentication
>>     - Shiro authentication ZEPPELIN-548
>>       <https://issues.apache.org/jira/browse/ZEPPELIN-548>
>>   - Authorization
>>     - Notebook authorization PR-681
>>       <https://github.com/apache/incubator-zeppelin/pull/681>
>>   - Security
>>   - Multi-tenancy
>>   - Stability
>> - Usability Improvement
>>   - UX improvement
>>   - Better Table data support
>>     - Download data as csv, etc PR-725
>>       <https://github.com/apache/incubator-zeppelin/pull/725>, PR-714
>>       <https://github.com/apache/incubator-zeppelin/pull/714>, PR-6
>>       <https://github.com/apache/incubator-zeppelin/pull/6>, PR-89
>>       <https://github.com/apache/incubator-zeppelin/pull/89>
>>     - Featureful table data display (pagination, etc)
>> - Pluggability ZEPPELIN-533
>>   <https://issues.apache.org/jira/browse/ZEPPELIN-533>
>>   - Pluggable visualization
>>   - Dynamic Interpreter, notebook, visualization loading
>>   - Repository and registry for pluggable components
>> - Improve documentation
>>   - Improve contents and readability
>>   - more tutorials, examples
>> - Interpreter
>>   - Generic JDBC Interpreter
>>   - (spark)R Interpreter
>>   - Cluster manager for interpreter (Proposal
>>     <https://cwiki.apache.org/confluence/display/ZEPPELIN/Cluster+Manager+Proposal>)
>>   - more interpreters
>> - Notebook storage
>>   - Versioning ZEPPELIN-540
>>     <http://issues.apache.org/jira/browse/ZEPPELIN-540>
>>   - more notebook storages
>> - Visualization
>>   - More visualizations PR-152
>>     <https://github.com/apache/incubator-zeppelin/pull/152>, PR-728
>>     <https://github.com/apache/incubator-zeppelin/pull/728>, PR-336
>>     <https://github.com/apache/incubator-zeppelin/pull/336>, PR-321
>>     <https://github.com/apache/incubator-zeppelin/pull/321>
>>   - Customize graph (show/hide label, color, etc)
>>
>> It will help anyone quickly grasp the overall interests of the project
>> and its direction.
>> And based on this roadmap, we can discuss and re-define the scope of the
>> next release, 0.6.0, and its schedule.
>>
>> What do you think? Any feedback would be appreciated.
>>
>> Thanks,
>> moon
>>
>> --
>> Vinayak Agrawal
>>
>> "To Strive, To Seek, To Find and Not to Yield!"
>> ~Lord Alfred Tennyson
>>
>> --
>> Vinayak Agrawal
>> Big Data Analytics
>> IBM
>>
>> "To Strive, To Seek, To Find and Not to Yield!"
>> ~Lord Alfred Tennyson