Re: PR Milestone policy

2019-01-07 Thread Gian Merlino
My feeling is that setting a milestone on PRs before they're merged is a way of making their authors feel more included. I don't necessarily see a problem with setting milestones optimistically and then, when a release branch is about to be cut (based on the timed release date), we bulk-update

Re: Off list major development

2019-01-07 Thread Julian Hyde
Small contributions don’t need any design review, whereas large contributions need significant review. I don’t think we should require an additional step for those (many) small contributions. But who decides whether a contribution fits into the small or large category? I think the solution is

Re: Off list major development

2019-01-07 Thread Gian Merlino
It sounds like splitting design from code review is a common theme in a few of the posts here. How does everyone feel about making a point of encouraging design reviews to be done as issues, separate from the pull request, with the expectations that (1) the design review issue ("proposal") should

Re: Off list major development

2019-01-07 Thread Julian Hyde
Statically, yes, GitHub PRs are the same as GitHub cases. But dynamically, they are different, because you can only log a PR when you have finished work. A lot of other Apache projects use JIRA, so there is a clear distinction between cases and contributions. JIRA cases, especially when logged

Re: Off list major development

2019-01-07 Thread Gian Merlino
I don't think there's a need to raise issues for every change: a small bug fix or doc fix should just go straight to PR. (GitHub PRs show up as issues in the issue-search UI/API, so it's not like this means the patch has no corresponding issue -- in a sense the PR _is_ the issue.) I do think it

Re: Druid 0.14 timing

2019-01-07 Thread Benedict Jin
On 2019/01/04 21:06:40, Gian Merlino wrote:
> It feels like 0.13.0 was just recently released, but it was branched off
> back in October, and it has almost been 3 months since then. How do we feel
> about doing an 0.14 branch cut at the end of January (Thu Jan 31) - going
> back to the every

Re: Watermarks!

2019-01-07 Thread Charles Allen
I'll answer the last question first: Many data groups are processed via Airflow, so having a batch component compatible with Airflow is more impactful than being able to live stream data as it stands right now. I'm constantly on the lookout for a use case where Druid streaming is a good fit for a

Re: Watermarks!

2019-01-07 Thread Gian Merlino
For Kafka, maybe something that tells you if all committed data is actually loaded, and what offset has been committed up to? Would there be any problems caused by the fact that only the most recent commit is saved in the DB? Is this feature connected at all to an ask I have heard from a few