Re: Druid 0.12.1-rc2 vote

2018-05-14 Thread Gian Merlino
+1

On Mon, May 14, 2018 at 1:43 PM, Driesprong, Fokko  wrote:

> +1 as well
>
> (Charles, your awesome email address works correctly! :)
>
> 2018-05-14 22:35 GMT+02:00 Charles Allen :
>
> > +1
> >
> > (crosses fingers that this email sends correctly from the awesome new
> > Apache address)
> >
> > On Mon, May 14, 2018 at 1:04 PM Prashant Deva 
> > wrote:
> >
> > > +1
> > >
> > > On Mon, May 14, 2018 at 12:58 PM Jihoon Son 
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > > we have no remaining issues (
> > > > > https://github.com/druid-io/druid/issues?utf8=%E2%9C%93=is%3Aissue+is%3Aopen+0.12.1
> > > > > )
> > > > > or PRs (
> > > > > https://github.com/druid-io/druid/pulls?q=is%3Aopen+is%3Apr+milestone%3A0.12.1
> > > > > )
> > > > > for 0.12.1.
> > > > Let's start a vote for rc2. This is a non-ASF release.
> > > >
> > > > Here is my +1.
> > > >
> > > > Best,
> > > > Jihoon
> > > >
> > > --
> > > Prashant
> > >
> >
>


Re: Druid 0.12.1-rc1 vote

2018-04-27 Thread Gian Merlino
Hmm, that to me feels like more of a 0.13.0 thing since it's not fixing a
bug, but it's bumping up the version of a dependency. Would love to get it
merged soon though.

On Fri, Apr 27, 2018 at 2:25 AM, Hagen Rother <hagen.rot...@liquidm.com>
wrote:

> May I also kindly suggest: https://github.com/druid-io/druid/pull/5674 -
> it's failing in Travis for unknown reasons. The bug in local tests has been
> fixed.
>
> On Thu, Apr 26, 2018 at 9:57 PM, Gian Merlino <g...@apache.org> wrote:
>
> > I'm reviewing https://github.com/druid-io/druid/pull/5692 right now and
> > thinking it'd be good to include it in 0.12.1 too. It's addressing a data
> > consistency bug with Kafka indexing.
> >
> > On Wed, Apr 25, 2018 at 3:24 PM, Gian Merlino <g...@apache.org> wrote:
> >
> > > +1 on doing 0.12.1-rc1.
> > >
> > > On Tue, Apr 24, 2018 at 2:59 PM, Jihoon Son <ghoon...@gmail.com> wrote:
> > >
> > >> Thanks Julian, I forgot to mention that.
> > >>
> > >> Jihoon
> > >>
> > >> On Tue, Apr 24, 2018 at 2:58 PM, Julian Hyde <jh...@apache.org> wrote:
> > >>
> > >> > You should make it clear that this is a non-ASF release. Such releases are
> > >> > allowed during incubation but not encouraged.
> > >> >
> > >> > Also be sure to mention it in your next report.
> > >> >
> > >> > Julian
> > >> >
> > >> >
> > >> > > On Apr 24, 2018, at 2:49 PM, Jihoon Son <jihoon...@apache.org> wrote:
> > >> > >
> > >> > > Hi all,
> > >> > >
> > >> > > We currently have no open issues/PRs for 0.12.1 (
> > >> > > https://github.com/druid-io/druid/milestone/26), so I created a branch for
> > >> > > 0.12.1 (https://github.com/druid-io/druid/tree/0.12.1).
> > >> > >
> > >> > > Let's vote on releasing RC1. Here is my +1.
> > >> > >
> > >> > > Best,
> > >> > > Jihoon
> > >> >
> > >> >
> > >> > -
> > >> > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > >> > For additional commands, e-mail: dev-h...@druid.apache.org
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>
>
>
> --
> *Hagen Rother*
> Lead Architect | LiquidM
> --
> LiquidM Technology GmbH
> Rosenthaler Str. 36 | 10178 Berlin | Germany
> Phone: +49 176 15 00 38 77
> Internet: www.liquidm.com | LinkedIn
> <http://www.linkedin.com/company/3488199?trk=tyah;
> trkInfo=tas%3AliquidM%2Cidx%3A1-2-2>
> --
> Managing Directors | André Bräuer, Philipp Simon, Thomas Hille
> Jurisdiction | Local Court Berlin-Charlottenburg HRB 152426 B
>


Re: A question about Druid design

2018-06-19 Thread Gian Merlino
Hi Anastasia,

Sorry for the delay in getting back to you. You're right that the
PlainFactsHolder is indexed by timestamp, not by TimeAndDims -- earlier I
was answering from memory and not from actually looking at the code! Shows
what I get for doing that.

The idea with RollupFactsHolder is that at ingestion time we are doing a
group-by time (truncated based on queryGranularity) and dimensions, as
described in the "roll up" section here:
http://druid.io/docs/latest/design/index.html#roll-up. So there will be
only one row per TimeAndDims (since we're aggregating input rows using
TimeAndDims as a key). And the idea with PlainFactsHolder is that we aren't
doing any rollup at all, we're just storing one row in Druid corresponding
to one row in the input. IIRC the only reason we have a map in that case is
because we want to be able to quickly iterate the rows in time-sorted order
(query engines like timeseries depend on this ability).
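
To make the distinction concrete, here is a minimal sketch of the two behaviours described above. The class and method names are made up for illustration and are not Druid's actual FactsHolder code; it assumes a single string dimension and a single summed metric. The rollup variant keys a sorted map by (truncated time, dimension) and aggregates on insert, while the plain variant keeps every input row and uses a time-keyed sorted map only so rows can be iterated in time order.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: these are NOT Druid's actual FactsHolder classes.
final class FactsHolderSketch {
    // One input row: timestamp in millis, one dimension value, one metric to sum.
    static final class InputRow {
        final long timestamp;
        final String dim;
        final long metric;

        InputRow(long timestamp, String dim, long metric) {
            this.timestamp = timestamp;
            this.dim = dim;
            this.metric = metric;
        }
    }

    // Rollup key: truncated time plus dimension value, ordered by time first.
    static final class TimeAndDims implements Comparable<TimeAndDims> {
        final long truncatedTime;
        final String dim;

        TimeAndDims(long truncatedTime, String dim) {
            this.truncatedTime = truncatedTime;
            this.dim = dim;
        }

        @Override
        public int compareTo(TimeAndDims other) {
            int byTime = Long.compare(truncatedTime, other.truncatedTime);
            return byTime != 0 ? byTime : dim.compareTo(other.dim);
        }
    }

    // Rollup: group by (truncated time, dims), so there is one aggregated row per key.
    static ConcurrentSkipListMap<TimeAndDims, Long> rollup(List<InputRow> rows, long granularityMillis) {
        ConcurrentSkipListMap<TimeAndDims, Long> facts = new ConcurrentSkipListMap<>();
        for (InputRow row : rows) {
            long truncated = Math.floorDiv(row.timestamp, granularityMillis) * granularityMillis;
            facts.merge(new TimeAndDims(truncated, row.dim), row.metric, Long::sum);
        }
        return facts;
    }

    // Plain: no aggregation; the map only exists to iterate rows in time order.
    static ConcurrentSkipListMap<Long, Deque<InputRow>> plain(List<InputRow> rows) {
        ConcurrentSkipListMap<Long, Deque<InputRow>> facts = new ConcurrentSkipListMap<>();
        for (InputRow row : rows) {
            facts.computeIfAbsent(row.timestamp, t -> new ArrayDeque<>()).add(row);
        }
        return facts;
    }

    // Four input rows; three of them fall into the same one-minute bucket.
    static List<InputRow> sampleRows() {
        return Arrays.asList(
            new InputRow(1_000L, "a", 1),
            new InputRow(1_500L, "a", 2),
            new InputRow(1_500L, "b", 3),
            new InputRow(61_000L, "a", 4));
    }

    static int rollupRowCount() {
        return rollup(sampleRows(), 60_000L).size();
    }

    static int plainRowCount() {
        int count = 0;
        for (Deque<InputRow> bucket : plain(sampleRows()).values()) {
            count += bucket.size();
        }
        return count;
    }
}
```

With the four sample rows and a one-minute granularity, the rollup map ends up with three rows (the two "a" rows in the first minute are summed), while the plain map still holds all four input rows.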

On Wed, Jun 13, 2018 at 6:56 AM Anastasia Braginsky
 wrote:

>  Hi Everyone,
> Could I please call for your attention? The Oak project is on, and I
> would like to join the next weekly video meeting to present our
> progress. However, we are still in doubt regarding Rollup- vs Plain-
> FactsHolder. Could someone please read the email chain below and help with
> an answer? Or would it be better to discuss it in the meeting?
>
> Thanks, Anastasia
>
>
>
> On Thursday, May 31, 2018, 6:40:12 PM GMT+3, Anastasia Braginsky <
> anas...@oath.com> wrote:
>
>   Hi Gian,
> Thanks for the explanations!
> I have one more question:
>
> You say that
> "...the RollupFactsHolder there will be a _single_ fact row per
> TimeAndDims... But with the PlainFactsHolder there may be more than one
> fact row per TimeAndDims..."
> In PlainFactsHolder we actually have more than one fact row per
> Timestamp, or am I missing something? I mean, in
> RollupFactsHolder could you scan only the TimeAndDims (leading to rows) with
> some Timestamp and get the same result? Is it true that TimeAndDims are
> ordered first by time anyway?
> I am most likely missing something, and would just like to understand what :)
> Thanks, Anastasia
>
> On Wednesday, May 30, 2018, 10:56:26 AM GMT+3, Gian Merlino <
> gianmerl...@gmail.com> wrote:
>
>  Hi Anastasia,
>
> 1) At ingestion time the FactsHolder is sorted. The unsorted code path is
> used by groupBy v1, which hasn't been common since groupBy v2 was made the
> default a few releases ago. So I would only worry about the sorted case.
>
> 2) PlainFactsHolder is used when the user has disabled rollup at ingestion
> time. The idea is that with the RollupFactsHolder there will be a _single_
> fact row per TimeAndDims (and Druid may combine multiple input rows into
> one indexed fact row). But with the PlainFactsHolder there may be more than
> one fact row per TimeAndDims (in particular: there will be one fact row per
> input row).
>
> Hope this helps.
>
> On Wed, May 30, 2018 at 12:14 AM, Anastasia Braginsky <
> anas...@oath.com.invalid> wrote:
>
> > Hi,
> > Recall our suggestion to use the new concurrent map named Oak as a base
> > for Incremental Index. Oak stands for Off-heap Allocated Keys; for more
> > details please see issue #5698. We have made great progress with Oak
> > integration and stabilizing OakIndex performance. We have some questions
> > regarding FactsHolder. As we explained in our design document and
> > refactoring suggestion, we prefer to remove the FactsHolder usage in
> > the OakIndex, because Oak maps the keys (Time) to the values
> > (Aggregators) directly. Therefore the Oak mapping is always sorted and only
> > from keys to values. From here we have two questions.
> >
> > 1. Unsorted FactsHolder: It is understandable that unsorted mapping via
> > HashMap (O(1) access) might be faster than sorted mapping (O(logN) access).
> > The question is whether the unsorted variant is used frequently. When is it
> > used? And is it acceptable that in this case Oak will give slightly lower
> > performance?
> >
> > 2. Regarding Plain- vs Rollup- FactsHolder: It can be seen that
> > PlainFactsHolder is holding a queue of Key->Value (Time->Aggregator)
> > per Timestamp, where the sorting is via Timestamp. Therefore, Oak
> > implements mostly sorted RollupFactsHolder logic. Additionally, Timestamp
> > is also a part of Time and the sorting is initially according to
> > Timestamp, then other dimensions. The question is: what are the use cases
> > where PlainFactsHolder, and not Rollup, is used? And is there any
> > functionality that can be given by Plain but not by Rollup?
> >
> > Thanks, Anastasia
> >
>


Re: New Druid Meetup group for LA / Venice / Santa Monica!

2018-06-14 Thread Gian Merlino
Awesome Charles! I joined in case I'm down there :)

If you are already thinking about scheduling the ones later in the year, a
couple of us are planning on being in the LA area from November 5–7 and I
would love to attend if the schedules line up.

On Sat, Jun 9, 2018 at 12:11 PM Charles Allen 
wrote:

> Hello all,
>
> I spawned up a meetup for the LA area at
> https://www.meetup.com/druidio-la/ .
> The reason for a different meetup is so the location stuff at meetup.com
> works correctly (compared to https://www.meetup.com/druidio ). If you are
> interested in keeping in touch with other analytics lovers in the LA area
> please feel free to join!
>
> Regards,
> Charles Allen
>


Re: New Druid Meetup group for LA / Venice / Santa Monica!

2018-06-14 Thread Gian Merlino
Btw I would suggest cross posting this message to the old user list too (
https://groups.google.com/forum/#!forum/druid-user). We haven't officially
shut that one down yet and it still has way more traffic than this user
list. (We've only shut down the old dev list.)

On Wed, Jun 13, 2018 at 11:53 PM Gian Merlino  wrote:

> Awesome Charles! I joined in case I'm down there :)
>
> If you are already thinking about scheduling the ones later in the year, a
> couple of us are planning on being in the LA area from November 5–7 and I
> would love to attend if the schedules line up.
>
> On Sat, Jun 9, 2018 at 12:11 PM Charles Allen 
> wrote:
>
>> Hello all,
>>
>> I spawned up a meetup for the LA area at
>> https://www.meetup.com/druidio-la/ .
>> The reason for a different meetup is so the location stuff at meetup.com
>> works correctly (compared to https://www.meetup.com/druidio ). If you are
>> interested in keeping in touch with other analytics lovers in the LA area
>> please feel free to join!
>>
>> Regards,
>> Charles Allen
>>
>


Re: Transactions in Druid?

2018-05-29 Thread Gian Merlino
Hi Edward,

There are a couple of ways to do transactional updates to Druid today:

1) When using streaming ingestion with the "parseBatch" feature
(see io.druid.data.input.impl.InputRowParser), rows in the batch are
inserted transactionally.
2) When using batch ingestion in overwrite mode (the default), all
operations are transactional. Of course, you must be willing to overwrite
an entire time interval in this mode.

With (1) scans are not transactionally consistent.

With (2) scans of a particular interval _are_ transactionally consistent,
due to the nature of how we handle overwrite-style ingestion (the new
segment set has a higher version number, and queries will use either the
older or newer version).

However Druid never offers read-your-writes consistency. There is always
some delay (however small) between when you trigger an insert and when that
insert is actually readable.
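
The "either the older or newer version" behaviour in (2) can be illustrated with a small sketch. This is not Druid's implementation; the class below is hypothetical and only assumes that an interval's segment set is an immutable value tagged with a version number, and that an overwrite publishes a whole new set in one atomic step. A reader that takes a snapshot therefore sees one complete version, never a mix.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of version-based overwrite; not Druid's actual timeline code.
final class VersionedIntervalSketch {
    // Immutable set of rows for one interval, tagged with a version number.
    static final class SegmentSet {
        final int version;
        final List<String> rows;

        SegmentSet(int version, List<String> rows) {
            this.version = version;
            this.rows = Collections.unmodifiableList(rows);
        }
    }

    private final AtomicReference<SegmentSet> current =
        new AtomicReference<>(new SegmentSet(0, Collections.<String>emptyList()));

    // Overwrite the whole interval; only a strictly higher version replaces the current one.
    void overwrite(int version, List<String> rows) {
        SegmentSet candidate = new SegmentSet(version, rows);
        current.accumulateAndGet(candidate, (old, next) -> next.version > old.version ? next : old);
    }

    // A query takes one snapshot and reads only from it.
    SegmentSet snapshot() {
        return current.get();
    }

    static String demo() {
        VersionedIntervalSketch interval = new VersionedIntervalSketch();
        interval.overwrite(1, Arrays.asList("a", "b"));
        SegmentSet before = interval.snapshot();   // a query that started before the overwrite
        interval.overwrite(2, Arrays.asList("c"));
        SegmentSet after = interval.snapshot();    // a query that started after the overwrite
        return before.version + ":" + before.rows + " " + after.version + ":" + after.rows;
    }

    static int staleOverwriteIsIgnored() {
        VersionedIntervalSketch interval = new VersionedIntervalSketch();
        interval.overwrite(2, Arrays.asList("new"));
        interval.overwrite(1, Arrays.asList("old"));
        return interval.snapshot().version;
    }
}
```

Note that the snapshot taken before the second overwrite keeps reading version 1, which also illustrates the lack of read-your-writes: a reader only sees the new data once it takes a snapshot after the publish.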

On Tue, May 29, 2018 at 4:38 PM, Edward Bortnikov  wrote:

> Hi, Community,
> Do we have any existing or perceived use cases of transactional
> (multi-row) updates to Druid? Same about transactionally consistent scans?
> Thanks, Edward


Re: A question about Druid design

2018-05-30 Thread Gian Merlino
Hi Anastasia,

1) At ingestion time the FactsHolder is sorted. The unsorted code path is
used by groupBy v1, which hasn't been common since groupBy v2 was made the
default a few releases ago. So I would only worry about the sorted case.

2) PlainFactsHolder is used when the user has disabled rollup at ingestion
time. The idea is that with the RollupFactsHolder there will be a _single_
fact row per TimeAndDims (and Druid may combine multiple input rows into
one indexed fact row). But with the PlainFactsHolder there may be more than
one fact row per TimeAndDims (in particular: there will be one fact row per
input row).

Hope this helps.

On Wed, May 30, 2018 at 12:14 AM, Anastasia Braginsky <
anas...@oath.com.invalid> wrote:

> Hi,
> Recall our suggestion to use the new concurrent map named Oak as a base
> for Incremental Index. Oak stands for Off-heap Allocated Keys; for more
> details please see issue #5698. We have made great progress with Oak
> integration and stabilizing OakIndex performance. We have some questions
> regarding FactsHolder. As we explained in our design document and
> refactoring suggestion, we prefer to remove the FactsHolder usage in
> the OakIndex, because Oak maps the keys (Time) to the values
> (Aggregators) directly. Therefore the Oak mapping is always sorted and only
> from keys to values. From here we have two questions.
>
> 1. Unsorted FactsHolder: It is understandable that unsorted mapping via
> HashMap (O(1) access) might be faster than sorted mapping (O(logN) access).
> The question is whether the unsorted variant is used frequently. When is it
> used? And is it acceptable that in this case Oak will give slightly lower
> performance?
>
> 2. Regarding Plain- vs Rollup- FactsHolder: It can be seen that
> PlainFactsHolder is holding a queue of Key->Value (Time->Aggregator)
> per Timestamp, where the sorting is via Timestamp. Therefore, Oak
> implements mostly sorted RollupFactsHolder logic. Additionally, Timestamp
> is also a part of Time and the sorting is initially according to
> Timestamp, then other dimensions. The question is: what are the use cases
> where PlainFactsHolder, and not Rollup, is used? And is there any
> functionality that can be given by Plain but not by Rollup?
>
> Thanks, Anastasia
>


Re: Access to jira

2018-05-31 Thread Gian Merlino
We should probably have a label for it too.

On Thu, May 31, 2018 at 9:23 AM, Gian Merlino  wrote:

> I don't see why not!
>
> On Thu, May 31, 2018 at 9:21 AM, Charles Allen  wrote:
>
>> Sounds good. I'd like to put some more formal tracking and responsibility
>> to the remaining incubator items. Would github issues be the preferred
>> place to do that?
>>
>> On Thu, May 31, 2018 at 9:20 AM Gian Merlino 
>> wrote:
>>
>> > I think we are planning to keep using GitHub issues, based on the
>> > discussion in the migration logistics thread. And based on the fact that
>> > Apache seems to allow that now (https://github.com/apache/fluo was given as
>> > an example). So probably the right thing to do is update
>> > http://incubator.apache.org/projects/druid.html accordingly?
>> >
>> > On Thu, May 31, 2018 at 9:15 AM, Charles Allen 
>> wrote:
>> >
>> > > Hi all
>> > >
>> > > http://incubator.apache.org/projects/druid.html says that
>> > > https://issues.apache.org/jira/browse/DRUID is our issue tracker, but I
>> > > don't seem to have access to it. Does anyone know how to apply for access
>> > > using an existing Apache JIRA login?
>> > >
>> > > Thanks,
>> > > Charles Allen
>> > >
>> >
>>
>
>


Re: Access to jira

2018-05-31 Thread Gian Merlino
I don't see why not!

On Thu, May 31, 2018 at 9:21 AM, Charles Allen  wrote:

> Sounds good. I'd like to put some more formal tracking and responsibility
> to the remaining incubator items. Would github issues be the preferred
> place to do that?
>
> On Thu, May 31, 2018 at 9:20 AM Gian Merlino 
> wrote:
>
> > I think we are planning to keep using GitHub issues, based on the
> > discussion in the migration logistics thread. And based on the fact that
> > Apache seems to allow that now (https://github.com/apache/fluo was given as
> > an example). So probably the right thing to do is update
> > http://incubator.apache.org/projects/druid.html accordingly?
> >
> > On Thu, May 31, 2018 at 9:15 AM, Charles Allen 
> wrote:
> >
> > > Hi all
> > >
> > > http://incubator.apache.org/projects/druid.html says that
> > > https://issues.apache.org/jira/browse/DRUID is our issue tracker, but I
> > > don't seem to have access to it. Does anyone know how to apply for access
> > > using an existing Apache JIRA login?
> > >
> > > Thanks,
> > > Charles Allen
> > >
> >
>


Re: Access to jira

2018-05-31 Thread Gian Merlino
I think we are planning to keep using GitHub issues, based on the
discussion in the migration logistics thread. And based on the fact that
Apache seems to allow that now (https://github.com/apache/fluo was given as
an example). So probably the right thing to do is update
http://incubator.apache.org/projects/druid.html accordingly?

On Thu, May 31, 2018 at 9:15 AM, Charles Allen  wrote:

> Hi all
>
> http://incubator.apache.org/projects/druid.html says that
> https://issues.apache.org/jira/browse/DRUID is our issue tracker, but I
> don't seem to have access to it. Does anyone know how to apply for access
> using an existing Apache JIRA login?
>
> Thanks,
> Charles Allen
>


Re: CLA still required?

2018-06-01 Thread Gian Merlino
Yes, we are still collecting them, although once we are fully migrated to
the ASF we won't anymore (as per ASF policy, as I understand it, CLAs
are only required for committers).

On Fri, Jun 1, 2018 at 6:27 AM, Pierre Lacave  wrote:

> Hi,
>
> With the incubation ongoing, do you still require CLA signed for
> contributions?
>
> Thanks
>


Re: Podling Report Reminder - June 2018

2018-06-20 Thread Gian Merlino
Hi Justin,

As Julian said, he is active, and has been very helpful as a mentor.
(Thanks Julian!!)

Thanks for checking up on us. We'll make sure not to miss next month's
report; we are aware they are a thing (we did a couple already -- Nishant
did one and I did one or two) but it looks like nobody noticed this one
until it was too late. About the non-ASF releases: I don't have a link
handy to the thread, but basically what is going on is that until recently
we didn't have the SGA sorted, which we understood to mean it was out of
the question to do ASF releases. Now it is sorted and we're looking to
migrate our git repos next. I think it is likely we'll want to do another
non-incubator minor release while that process is ongoing, but am still
optimistic that the next major release will be an incubator release.

On Tue, Jun 19, 2018 at 6:00 PM Justin Mclean 
wrote:

> Hi,
>
> I noticed the podling has failed to report this month and so will need two
> reports next month. You're a relatively new podling, so I wonder: are you just
> not aware of the process, or is it that you need more help from your
> mentors? Are your mentors currently active?
>
> I also notice on your dev list a number of votes for non-ASF releases. I
> can’t see (but may have missed it) any discussion of why these non-ASF
> releases are needed or what is holding you up making an ASF release. It
> seems a little odd to me that you would vote on these and then not get the
> IPMC to look at them, so it may be a good idea to have the IPMC look at these
> releases in the future, but hopefully your next release will be an ASF one
> and will be voted on by the IPMC.
>
> Thanks,
> Justin
>
>


Re: Druid repo migration plan

2018-06-20 Thread Gian Merlino
+1; thanks Jon!

On Wed, Jun 20, 2018 at 5:52 PM Jihoon Son  wrote:

> +1
>
> Sounds good to me.
>
> Jihoon
>
> On Wed, Jun 20, 2018 at 5:12 PM Nishant Bangarwa <
> nbanga...@hortonworks.com>
> wrote:
>
> > +1
> >
> > --
> > Nishant Bangarwa
> >
> > Hortonworks
> >
> > On 6/20/18, 3:57 PM, "Jonathan Wei"  wrote:
> >
> > Hi all,
> >
> > The SGA for Druid has been sorted out, so we can get started on migrating
> > the old Github repo to Apache.
> >
> > Based on the discussion in our previous migration thread (
> > https://groups.google.com/forum/#!msg/druid-development/q1ip-L8xpBk/GPK1LhC7BQAJ
> > ),
> > it seems we favor using our existing Github PR and issues workflows.
> >
> > I'll file a JIRA ticket requesting transfer of
> > https://github.com/druid-io/druid to a Gitbox-style Apache repo, keeping
> > the existing history of PRs/issues/stars/etc. (e.g., Superset:
> > https://github.com/apache/incubator-superset)
> >
> > Before I do that, I wanted to open this thread for a vote to confirm that
> > we're all okay with this plan, so please chime in with an approval or any
> > concerns that you may have.
> >
> > Thanks,
> > Jon
> >
> >
> >
>


Re: Considering 0.12.2 release

2018-06-20 Thread Gian Merlino
+1

I would suggest adding these two as well.

https://github.com/druid-io/druid/pull/5878 - Fix inefficient available
segment cache population in SQLMetadataSegmentManager.
https://github.com/druid-io/druid/pull/5873 - HdfsDataSegmentPusher: Close
tmpIndexFile before copying it.

On Tue, Jun 12, 2018 at 11:34 AM Prashant Deva 
wrote:

> +1
>
> On Fri, Jun 8, 2018 at 3:37 PM Jihoon Son  wrote:
>
> > Hi guys,
> >
> > we have a couple of bug fix PRs available and some of them fix regression
> > bugs.
> >
> > Here is the list of currently available bug fix PRs which are not included
> > in 0.12.1.
> >
> > Regression bug fixes
> > - https://github.com/druid-io/druid/pull/5858
> > - https://github.com/druid-io/druid/pull/5805
> > - https://github.com/druid-io/druid/pull/5807
> >
> > Non-regression bug fixes
> > - https://github.com/druid-io/druid/pull/5850
> > - https://github.com/druid-io/druid/pull/5815
> > - https://github.com/druid-io/druid/pull/5856
> >
> > I think it's worth making another release for users.
> >
> > Any ideas are welcome.
> >
> > Jihoon
> >
> --
> Prashant
>


Re: Druid Security / Segment Encryption

2018-07-02 Thread Gian Merlino
Hi Ben,

Druid's security features today consist of an authentication/authorization
layer, and the ability to use TLS. To my knowledge encrypting the data
files at rest has not been looked into yet. In the past when I've been
asked, I've suggested using disk encryption, and people usually seem happy
with that. But it sounds like you have more strict requirements.

Since Druid's segment format is column oriented, you could imagine each
column being encrypted with its own key. Possibly the same system that
handles compression could handle encryption too (we compress columns in
chunks of a few thousand rows each). I'm not enough of an encryption expert
to know if that's the right way to go, but it would be a possibility.
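
As a rough illustration of the "each column encrypted with its own key, chunk by chunk" idea, here is a minimal sketch using the standard javax.crypto API with AES-GCM. Everything about it is an assumption made for illustration: the class name, the in-memory key map, the choice of AES-128-GCM, and the layout (random IV followed by ciphertext) are not part of Druid, and real key management would need much more care than this.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical sketch: per-column keys applied to column chunks. Not a Druid API.
final class ColumnChunkEncryptionSketch {
    private static final int IV_BYTES = 12;   // recommended IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    private final Map<String, SecretKey> keysByColumn = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    // Generate a fresh AES key for one column (assumption: keys live in memory).
    void registerColumn(String column) throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        keysByColumn.put(column, generator.generateKey());
    }

    // Encrypt one chunk; the stored form is IV || ciphertext-with-tag.
    byte[] encryptChunk(String column, byte[] chunk) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        random.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, keysByColumn.get(column), new GCMParameterSpec(TAG_BITS, iv));
        byte[] ciphertext = cipher.doFinal(chunk);
        byte[] stored = Arrays.copyOf(iv, IV_BYTES + ciphertext.length);
        System.arraycopy(ciphertext, 0, stored, IV_BYTES, ciphertext.length);
        return stored;
    }

    // Decrypt one stored chunk with the named column's key.
    byte[] decryptChunk(String column, byte[] stored) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, keysByColumn.get(column),
            new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
        return cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
    }

    static boolean roundTrips() {
        try {
            ColumnChunkEncryptionSketch store = new ColumnChunkEncryptionSketch();
            store.registerColumn("client");
            byte[] chunk = "acme,acme,globex".getBytes(StandardCharsets.UTF_8);
            return Arrays.equals(chunk, store.decryptChunk("client", store.encryptChunk("client", chunk)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean otherColumnKeyIsRejected() {
        try {
            ColumnChunkEncryptionSketch store = new ColumnChunkEncryptionSketch();
            store.registerColumn("client");
            store.registerColumn("country");
            byte[] stored = store.encryptChunk("client", "acme".getBytes(StandardCharsets.UTF_8));
            try {
                store.decryptChunk("country", stored);  // wrong key for this chunk
                return false;
            } catch (AEADBadTagException expected) {
                return true;
            }
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whether chunk-level encryption like this would fit alongside the existing compression step is an open design question; the sketch only shows that a chunk encrypted under one column's key cannot be read with another column's key.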

On Mon, Jul 2, 2018 at 4:42 PM Ben DeMott  wrote:

> Was wondering if anyone had worked on, considered, or thought about
> security or privacy in Druid.
> Where I work has extremely strict requirements on storing some types of
> client data.
> Ideally we would encrypt data per-client in such a way where querying
> segments requires an encryption key based upon a given dimension of the
> data (client).
>
> Has anyone worked on this, or homomorphic encryption in Druid?
>
> Thanks,
> Ben
>


Re: [druid-user] Druid 12.1 Datasource load fails in Coordinator due change in implementation from Set to Map

2018-07-03 Thread Gian Merlino
Yes, this should be out soon. We regret the regression! If you are
comfortable patching #5878 into your build, that should fix it. It will
also be included in 0.12.2.

Gian
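
For readers following the regression report quoted below, one plausible reading (an assumption, not something confirmed in this thread) is that the membership check runs against a collection view of the map's values. On a java.util.concurrent.ConcurrentSkipListMap, values().contains(x) has to walk the entries, so each check is linear and loading n segments becomes quadratic, whereas a keyed lookup or insert stays logarithmic. The sketch below uses a hypothetical stand-in class, not Druid's DruidDataSource, and is not meant to reproduce what #5878 actually changed.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for a datasource that keeps an id -> segment map.
final class SegmentLookupSketch {
    private final ConcurrentSkipListMap<String, String> idToSegment = new ConcurrentSkipListMap<>();

    // Slow path: values() is a view, and contains() on it walks the whole map,
    // so each call is O(n) and loading n segments is O(n^2) overall.
    boolean addByValueScan(String id, String segment) {
        if (!idToSegment.values().contains(segment)) {
            idToSegment.put(id, segment);
            return true;
        }
        return false;
    }

    // Faster path: a keyed insert is O(log n) on a skip list.
    boolean addByKey(String id, String segment) {
        return idToSegment.putIfAbsent(id, segment) == null;
    }

    int size() {
        return idToSegment.size();
    }

    // Load n segments twice; the second round only offers duplicates.
    static int load(int n, boolean byKey) {
        SegmentLookupSketch dataSource = new SegmentLookupSketch();
        for (int round = 0; round < 2; round++) {
            for (int i = 0; i < n; i++) {
                String id = "segment-" + i;
                if (byKey) {
                    dataSource.addByKey(id, id);
                } else {
                    dataSource.addByValueScan(id, id);
                }
            }
        }
        return dataSource.size();
    }
}
```

Both paths end up with the same contents; only the cost per insert differs, which would be consistent with per-insert times that grow as the map fills up.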

On Thu, Jun 28, 2018 at 10:35 AM, Samarth Jain 
wrote:

> Adding the dev email group.
>
> We are currently hitting this problem in our environment too, where loading
> 200K segments is taking forever, whereas on 10.1 the load happened in less
> than 5 minutes. I see a pull request
> (https://github.com/druid-io/druid/pull/5878) that potentially fixes this
> issue and that was checked in to master. I believe this fix would be part of
> the 0.12.2 release whenever it comes out.
>
>
>
>
> On Thu, Jun 28, 2018 at 1:50 AM, Venu Reddy 
> wrote:
>
>> Hi Team,
>>
>> We have close to ~500,000 active data segments in the metadata store
>> (Postgres).
>> The Coordinator is running on a 4 CPU CentOS server.
>>
>> We have upgraded from 0.10.0 to Druid 0.12.1. After this, when we bring up
>> the Coordinator we see the behaviour below.
>>
>> The datasource loading keeps running and goes into a hung state inside
>> the poll() function in *SQLMetadataSegmentManager.java*.
>> On further debugging we see that the portion below is the one that is taking
>> time:
>>
>> if (!dataSource.getSegments().contains(segment)) {
>>   dataSource.addSegment(segment);
>> }
>>
>> And it seems like the main reason it is taking time is the change
>> in the file *DruidDataSource.java* from *ConcurrentSkipListSet* (and a
>> HashMap) to *ConcurrentSkipListMap*.
>>
>> We added additional logging statements to time the above section in
>> *SQLMetadataSegmentManager.java* and we see that as the loop runs
>> collecting segments, initially the time taken is less than a millisecond,
>> but as the loop inserts more records into the
>> *ConcurrentSkipListMap*, the insertions take ~8 ms by ~50k records and
>> then increase all the way to ~300 ms when we reach 300K records.
>>
>> We also added the same timers to the *lower version* of Druid, and with
>> the *ConcurrentSkipListSet* implementation the loop completes processing
>> the 500k records in 5 mins.
>> Also, when we try with a higher config machine (32 CPU), we still see the
>> same behaviour.
>>
>> In summary, it seems like *ConcurrentSkipListMap* is slower than
>> *ConcurrentSkipListSet* and is resulting in some sort of timeout in
>> version 0.12.1, whereas the same number of segments are getting loaded
>> without issues in under 10 mins in version 0.10.0. Also, when we check the
>> code, the code in 0.11.0 seems identical to 0.10.0; however, 0.12.1 has
>> this change.
>>
>> Regards,
>> Venu
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Druid User" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to druid-user+unsubscr...@googlegroups.com.
>> To post to this group, send email to druid-u...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/druid-user/50a75795-57af-455c-955b-7153379b9253%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>


Re: Considering 0.12.2 release

2018-06-29 Thread Gian Merlino
Yeah, that one is definitely a good idea to include.

On Thu, Jun 28, 2018 at 11:32 AM Samarth Jain 
wrote:

> Never mind, I see Gian has already suggested to include the fix in
> https://github.com/druid-io/druid/pull/5878.
>
> On Thu, Jun 28, 2018 at 11:30 AM, Samarth Jain 
> wrote:
>
> > How about https://github.com/druid-io/druid/pull/5878? It looks like a
> > pretty serious regression too, IMHO.
> >
> > On Tue, Jun 26, 2018 at 11:05 AM, Jihoon Son 
> wrote:
> >
> >> Thanks guys.
> >>
> > >> I'll try to finish 5729 and 5745 this week and then start the release
> > >> process for 0.12.2.
> >>
> >> Jihoon
> >>
> > >> On Tue, Jun 26, 2018 at 10:42 AM Prashant Deva wrote:
> >>
> >> > Jihoon,
> >> >  +1 for adding those kafka indexing bugs to 0.12.2
> >> >
> >> > Prashant
> >> >
> >> >
> >> > On Mon, Jun 25, 2018 at 1:54 PM Gian Merlino  wrote:
> >> >
> > >> > > I am in favor of including these in 0.12.2 as we work towards
> > >> > > robustifying Kafka indexing.
> >> > >
> >> > > On Mon, Jun 25, 2018 at 11:15 AM Jihoon Son 
> >> > wrote:
> >> > >
> > >> > > > Hi guys,
> > >> > > >
> > >> > > > some bugs have recently been reported for the Kafka indexing service,
> > >> > > > and they look worthwhile to fix in our next release as well.
> > >> > > > Here are some related issues.
> > >> > > >
> > >> > > > - Broken contract of Appenderator in KafkaIndexTask (
> > >> > > > https://github.com/druid-io/druid/issues/5729)
> > >> > > > - Kafka Indexing Service task pause forever until timeout if no
> > >> > > > events in taskduration? (https://github.com/druid-io/druid/issues/5656,
> > >> > > > PR available)
> > >> > > > - ConcurrentModificationException in KafkaIndexTask (
> > >> > > > https://github.com/druid-io/druid/issues/5745)
> > >> > > > - Kafka: Reordered segment allocation causes spurious failures (
> > >> > > > https://github.com/druid-io/druid/issues/5761, PR available)
> > >> > > > - KafkaSupervisor NPE in checkPendingCompletionTasks when a group
> > >> > > > times out (https://github.com/druid-io/druid/issues/5900)
> > >> > > >
> > >> > > > I'm currently working on https://github.com/druid-io/druid/issues/5729
> > >> > > > and https://github.com/druid-io/druid/issues/5745, and PRs will be
> > >> > > > ready this week.
> > >> > > >
> > >> > > > What do you guys think about including fixes for these bugs in 0.12.2?
> > >> > > >
> >> > > > Jihoon
> >> > > >
> >> > > > On Mon, Jun 25, 2018 at 10:52 AM Prashant Deva <
> >> > prashant.d...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > +1 for 0.12.2.
> >> > > > >
> >> > > > > We would rather have those fixes now than wait till 0.13.
> >> > > > >
> >> > > > > Prashant
> >> > > > >
> >> > > > >
> >> > > > > On Wed, Jun 20, 2018 at 11:55 PM Gian Merlino 
> >> > wrote:
> >> > > > >
> > >> > > > > > I think we're hoping that the next major (0.13) will also be our
> > >> > > > > > first Apache release. If it ends up taking too long we might want to
> > >> > > > > > rethink that, but hopefully it won't (see the recently started thread
> > >> > > > > > about the next step, migrating the Github repos).
> >> > > > > >
> > >> > > > > > On Wed, Jun 20, 2018 at 10:42 PM Roman Leventov <leventov...@gmail.com> wrote:
> >> > > > > >
> > >> > > > > > > Why not release 13.0 instead (as non-Apache)?
> >> > > > > > >
> >>

Re: Podling Report Reminder - May 2018

2018-05-03 Thread Gian Merlino
Here's a draft (below). Since it's already overdue, I'll post it on the
wiki in a few hours.

Druid

Druid is a high-performance, column-oriented, distributed data store.

Druid has been incubating since 2018-02-28.

Three most important issues to address in the move towards graduation:

 1. Complete SGA for current sources and ICLAs for current committers.
 2. Move the source code and website to Apache infrastructure.
 3. Plan and execute our first Apache release.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

- None.

How has the community developed since the last report?

- We have disabled posting on our pre-Apache development list.
- We have updated our pre-Apache community page (http://druid.io/community/)
  to include information about incubation.
- We have set up a placeholder site on https://druid.apache.org/.
- A healthy, constant flow of bug fixes, quality improvements and new
  features is still ongoing on https://github.com/druid-io/druid.

How has the project developed since the last report?

- Since the last report there have been 69 commits from 23 individuals.
- We have conducted a vote to put out a release candidate 0.12.1-rc1. This
  release candidate is being done outside the Incubator. We also anticipate
  the 0.12.1 release to be done outside the Incubator.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [X] Initial setup
  [ ] Working towards first release
  [ ] Community building
  [ ] Nearing graduation
  [ ] Other:

Date of last release:

- Druid 0.12.0 on 2018-03-06 (non-Apache release)
- No official Apache release yet since beginning Apache Incubation

When were the last committers or PPMC members elected?

- Project is still functioning with the initial set of committers.

Signed-off-by:

  [ ](druid) Julian Hyde
  [ ](druid) P. Taylor Goetz
  [ ](druid) Jun Rao

On Thu, May 3, 2018 at 11:06 AM, P. Taylor Goetz <ptgo...@gmail.com> wrote:

> Great, thanks!
>
> If you have any trouble posting to the wiki, you can forward the report to
> me and I will post it.
>
> -Taylor
>
> > On May 3, 2018, at 2:01 PM, Gian Merlino <g...@apache.org> wrote:
> >
> > I can take up this one.
> >
> > On Thu, May 3, 2018 at 10:48 AM, P. Taylor Goetz <ptgo...@gmail.com>
> wrote:
> >
> >> The druid podling report was due yesterday. Is anyone working on it or
> >> willing to take this up?
> >>
> >> -Taylor
> >>
> >>> On Apr 27, 2018, at 11:01 PM, johndam...@apache.org wrote:
> >>>
> >>> Dear podling,
> >>>
> >>> This email was sent by an automated system on behalf of the Apache
> >>> Incubator PMC. It is an initial reminder to give you plenty of time to
> >>> prepare your quarterly board report.
> >>>
> >>> The board meeting is scheduled for Wed, 16 May 2018, 10:30 am PDT.
> >>> The report for your podling will form a part of the Incubator PMC
> >>> report. The Incubator PMC requires your report to be submitted 2 weeks
> >>> before the board meeting, to allow sufficient time for review and
> >>> submission (Wed, May 02).
> >>>
> >>> Please submit your report with sufficient time to allow the Incubator
> >>> PMC, and subsequently board members to review and digest. Again, the
> >>> very latest you should submit your report is 2 weeks prior to the board
> >>> meeting.
> >>>
> >>> Candidate names should not be made public before people are actually
> >>> elected, so please do not include the names of potential committers or
> >>> PPMC members in your report.
> >>>
> >>> Thanks,
> >>>
> >>> The Apache Incubator PMC
> >>>
> >>> Submitting your Report
> >>>
> >>> --
> >>>
> >>> Your report should contain the following:
> >>>
> >>> *   Your project name
> >>> *   A brief description of your project, which assumes no knowledge of
> >>>   the project or necessarily of its field
> >>> *   A list of the three most important issues to address in the move
> >>>   towards graduation.
> >>> *   Any issues that the Incubator PMC or ASF Board might wish/need to
> be
> >>>   aware of
> >>> *   How has the community developed since the last report
> >>> *   How has the project developed since the last report.
> >>> *   How does the podling rate their own maturity.
> >>>
> >>> This should be appended to the Incubator Wiki page at:
> >>>
> 

Re: Podling Report Reminder - May 2018

2018-05-03 Thread Gian Merlino
I have posted this on https://wiki.apache.org/incubator/May2018.

On Thu, May 3, 2018 at 1:03 PM, Gian Merlino <g...@apache.org> wrote:

> Here's a draft (below). Since it's already overdue, I'll post it on the
> wiki in a few hours.
>
> Druid
>
> Druid is a high-performance, column-oriented, distributed data store.
>
> Druid has been incubating since 2018-02-28.
>
> Three most important issues to address in the move towards graduation:
>
>  1. Complete SGA for current sources and ICLAs for current committers.
>  2. Move the source code and website to Apache infrastructure.
>  3. Plan and execute our first Apache release.
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> of?
>
> - None.
>
> How has the community developed since the last report?
>
> - We have disabled posting on our pre-Apache development list.
> - We have updated our pre-Apache community page (
> http://druid.io/community/) to
>   include information about incubation.
> - We have set up a placeholder site on https://druid.apache.org/.
> - A healthy, constant flow of bug fixes, quality improvements and new
> features
>   are still ongoing on https://github.com/druid-io/druid.
>
> How has the project developed since the last report?
>
> - Since the last report there have been 69 commits from 23 individuals.
> - We have conducted a vote to put out a release candidate 0.12.1-rc1. This
>   release candidate is being done outside the Incubator. We also
> anticipate the
>   0.12.1 release to be done outside the Incubator.
>
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
>
>   [X] Initial setup
>   [ ] Working towards first release
>   [ ] Community building
>   [ ] Nearing graduation
>   [ ] Other:
>
> Date of last release:
>
> - Druid 0.12.0 on 2018-03-06 (non-Apache release)
> - No official Apache release yet since beginning Apache Incubation
>
> When were the last committers or PPMC members elected?
>
> - Project is still functioning with the initial set of committers.
>
> Signed-off-by:
>
>   [ ](druid) Julian Hyde
>   [ ](druid) P. Taylor Goetz
>   [ ](druid) Jun Rao
>
> On Thu, May 3, 2018 at 11:06 AM, P. Taylor Goetz <ptgo...@gmail.com>
> wrote:
>
>> Great, thanks!
>>
>> If you have any trouble posting to the wiki, you can forward the report
>> to me and I will post it.
>>
>> -Taylor
>>
>> > On May 3, 2018, at 2:01 PM, Gian Merlino <g...@apache.org> wrote:
>> >
>> > I can take up this one.
>> >
>> > On Thu, May 3, 2018 at 10:48 AM, P. Taylor Goetz <ptgo...@gmail.com>
>> wrote:
>> >
>> >> The druid podling report was due yesterday. Is anyone working on it or
>> >> willing to take this up?
>> >>
>> >> -Taylor
>> >>
>> >>> On Apr 27, 2018, at 11:01 PM, johndam...@apache.org wrote:
>> >>>
>> >>> Dear podling,
>> >>>
>> >>> This email was sent by an automated system on behalf of the Apache
>> >>> Incubator PMC. It is an initial reminder to give you plenty of time to
>> >>> prepare your quarterly board report.
>> >>>
>> >>> The board meeting is scheduled for Wed, 16 May 2018, 10:30 am PDT.
>> >>> The report for your podling will form a part of the Incubator PMC
>> >>> report. The Incubator PMC requires your report to be submitted 2 weeks
>> >>> before the board meeting, to allow sufficient time for review and
>> >>> submission (Wed, May 02).
>> >>>
>> >>> Please submit your report with sufficient time to allow the Incubator
>> >>> PMC, and subsequently board members to review and digest. Again, the
>> >>> very latest you should submit your report is 2 weeks prior to the
>> board
>> >>> meeting.
>> >>>
>> >>> Candidate names should not be made public before people are actually
>> >>> elected, so please do not include the names of potential committers or
>> >>> PPMC members in your report.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> The Apache Incubator PMC
>> >>>
>> >>> Submitting your Report
>> >>>
>> >>> --
>> >>>
>> >>> Your report should contain the following:
>> >>>
>> >>> *   Your project name
>> >>> *   A brief description of your project, whi

Re: Druid repo migration plan

2018-07-03 Thread Gian Merlino
Here is the ticket, btw: https://issues.apache.org/jira/browse/INFRA-16674.
Repo move should be happening real soon now!

On Mon, Jul 2, 2018 at 11:55 PM Gian Merlino  wrote:

> Our infra ticket is progressing along and it looks like we're just about
> ready to pull the trigger on moving the repo. So, committers, please make
> sure your ASF gitbox stuff is working: https://gitbox.apache.org/setup/
>
> On Fri, Jun 22, 2018 at 1:22 PM Gian Merlino  wrote:
>
>> Thanks for the tips, Max!! I think we are, hopefully, doing okay on some
>> of these. My thoughts inline.
>>
>> > Since you need elevated rights on both
>> orgs to move the repo (say airbnb and apache) and that both parties aren't
>> ok with that, it's typical to use a middleman org like `apacheinfra`.
>>
>> Luckily, our org is limited to just Druid stuff (
>> https://github.com/druid-io) so we should be OK to add Apache Infra
>> people with elevated rights.
>>
>> > * make merge hook checks optional, so that coverage, Travis, or code
>> quality checks do not prevent merging, since it's likely those checks won't
>> trigger and as a non-admin you won't be able to force-merge
>>
>> We have a couple (Travis and TeamCity) and they're already optional.
>>
>> > * consider unprotecting protected branches so that you can push to
>> master
>> if controlling master is important in your workflow. This way you can
>> effectively merge PRs without clicking the button on GH.
>>
>> Master _is_ important. Although I think if we can't do PRs, then pushing
>> directly to master is probably not going to be too helpful anyway (the PRs
>> are essential to our code review workflow). So I think we have to hope for
>> the best here?
>>
>> > * make sure core committers have their Gitbox access setup, I think it
>> can
>> be a bit tricky and may involve your mentor / infra pulling some levers on
>> whimsy
>>
>> I went through this process (via https://gitbox.apache.org/setup/) to
>> get the ability to push to
>> https://github.com/apache/incubator-druid-website, which is powering
>> https://druid.apache.org/. It took a little while and was kind of
>> confusing but it does work now. Other Druid committers: sounds like getting
>> set up on GitBox early is a good thing, so please check it out!
>>
>> On Fri, Jun 22, 2018 at 11:51 AM Maxime Beauchemin <
>> maximebeauche...@gmail.com> wrote:
>>
>>> @julian gotcha, I thought this was a more official vote
>>>
>>> The Superset GH move INFRA ticket shows how the move can be really
>>> tricky/slow/disruptive. There was quite a period of instability for us
>>> and
>>> a lot of slow back and forth with Apache infra. Hopefully the process has
>>> been ironed out since then. Be prepared and go into it knowing that you
>>> may
>>> not be able to merge PRs for days/weeks.
>>> https://issues.apache.org/jira/browse/INFRA-14267
>>>
>>> On the ticket you open with INFRA, make it really clear what your GH
>>> integrations are and validate that they are all approved/supported by
>>> Apache prior to the move. Some integrations (like codeclimate) require
>>> rights on the GH org (Apache) and INFRA is categoric against that. If
>>> some
>>> services aren't supported make sure to disable the integrations prior to
>>> the move, find replacement services. Also make sure INFRA will
>>> adjust/tweak
>>> the integration post move as you likely need admin rights to do so.
>>>
>>> A caveat is around the "redirect chain" on GH. This is what allows
>>> hitting `github.com/airbnb/superset` to redirect to
>>> `github.com/apache/incubator-superset`, that is, to the right place. This
>>> also allows `git remote`s to just work post move. This redirect chain is
>>> fragile and can break in some cases. Since you need elevated rights on
>>> both
>>> orgs to move the repo (say airbnb and apache) and that both parties
>>> aren't
>>> ok with that, it's typical to use a middleman org like `apacheinfra`.
>>> They
>>> grant you admin right to that org and you move the repo to there, and
>>> they
>>> do the second part. If, post move, the middleman was to fork a repo with
>>> the same name, or create one, it would break the redirect chain.
>>> Something
>>> INFRA should be aware of at this point and cautious

Re: Podling Report (August 2018)

2018-08-02 Thread Gian Merlino
That sounds like a good question for gene...@incubator.apache.org.

On Thu, Aug 2, 2018 at 2:22 PM Jonathan Wei  wrote:

> Hm, looks like the wiki page https://wiki.apache.org/incubator/August2018
> still
> doesn't exist, any idea when it'll be up?
>
>
> On Thu, Aug 2, 2018 at 1:04 PM, Julian Hyde  wrote:
>
> > Please email the mentors (or the dev list) when it is posted to the wiki
> > and is ready for sign-off.
> >
> > Julian
> >
> >
> > > On Aug 1, 2018, at 7:19 PM, Jonathan Wei  wrote:
> > >
> > > I don't see the incubator wiki page for August 2018 up yet, so I'll
> post
> > > the current report here for now:
> > >
> > >
> > > Druid Podling Report (August 2018)
> > > 
> > >
> > > Druid is a high-performance, column-oriented, distributed data store.
> > >
> > > Druid has been incubating since 2018-02-28.
> > >
> > > Three most important issues to address in the move towards graduation:
> > >
> > > 1. Plan and execute our first Apache release.
> > > 2. Move the website to Apache infrastructure.
> > > > 3. Expand the community and add more committers.
> > >
> > > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> > aware
> > > of?
> > >
> > > - None.
> > >
> > > How has the community developed since the last report?
> > >
> > > - A healthy, constant flow of bug fixes, quality improvements and new
> > > features
> > >  are still ongoing at https://github.com/apache/incubator-druid.
> > > - Our next community meetup has been scheduled for August 8.
> > >
> > > How has the project developed since the last report?
> > >
> > > - This report covers activity since the May 2018 report
> > > - Source has been migrated to Apache infrastructure
> > > - License header updates are almost complete
> > > - Since the last report there have been 93 commits from 20 individuals.
> > > - We have released 0.12.1, a non-incubator release.
> > > > - We are currently voting on the 0.12.2 non-incubator bug fix release. This
> > > > will be our final non-incubator release.
> > >
> > > How would you assess the podling's maturity?
> > > Please feel free to add your own commentary.
> > >
> > >  [ ] Initial setup
> > >  [X] Working towards first release
> > >  [ ] Community building
> > >  [ ] Nearing graduation
> > >  [ ] Other:
> > >
> > > Date of last release:
> > >
> > > - Druid 0.12.1 on 2018-06-08 (non-Apache release)
> > > - No official Apache release yet since beginning Apache Incubation
> > >
> > > When were the last committers or PPMC members elected?
> > >
> > > - Project is still functioning with the initial set of committers.
> > >
> > > On Tue, Jul 31, 2018 at 4:02 PM, Jonathan Wei 
> wrote:
> > >
> > >> Thanks for reviewing.
> > >>
> > >> Our last meetup was in March, but we have an upcoming meetup on
> August 8
> > >> (after the report is due I assume), should we mention that in this
> > report?
> > >>
> > >> I noticed the incubator wiki page for August 2018 hasn't been created
> > yet,
> > >> does anyone know if that's expected to be up soon? (Not sure on what
> > exact
> > >> date our podling report is due)
> > >>
> > >> - Jon
> > >>
> > >>
> > > >> On Sun, Jul 29, 2018 at 10:30 AM, Julian Hyde wrote:
> > >>
> > >>> Thanks - this looks good.
> > >>>
> > >>> I’d not include the url for the case to replace license headers.
> > Reports
> > >>> rarely include urls whose sole purpose is to prove statements in the
> > >>> report.
> > >>>
> > >>> If there have been meet ups/talks about Druid, mention them. Druid
> has
> > a
> > >>> vibrant community, as a result of your ongoing community building
> > >>> activities such as talks, but still, you should take credit for them.
> > >>>
> > >>> Julian
> > >>>
> >  On Jul 28, 2018, at 10:24 AM, Jonathan Wei 
> wrote:
> > 
> >  Hi all,
> > 
> >  I'm posting a draft of the August 2018 report (which covers activity
> > >>> since
> >  our last report in May):
> > 
> > 
> > 
> >  Druid Podling Report (August 2018)
> >  
> > 
> >  Druid is a high-performance, column-oriented, distributed data
> store.
> > 
> >  Druid has been incubating since 2018-02-28.
> > 
> >  Three most important issues to address in the move towards
> graduation:
> > 
> >  1. Plan and execute our first Apache release.
> >  2. Move the website to Apache infrastructure.
> >  3. Expanding the community and adding more committers
> > 
> >  Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to
> be
> > >>> aware
> >  of?
> > 
> >  - None.
> > 
> >  How has the community developed since the last report?
> > 
> >  - A healthy, constant flow of bug fixes, quality improvements and
> new
> >  features
> >  are still ongoing at https://github.com/apache/incubator-druid.
> > 
> >  How has the project developed since the last report?
> > 
> >  - This report covers activity since the May 2018 report
> >  - Source has been 

Re: Issue with documentation

2018-07-25 Thread Gian Merlino
Hey Himanshu,

The docs are kept in the main Druid repo and released to the site when we
do Druid releases. The source for the doc you mentioned is here:
https://github.com/apache/incubator-druid/blob/master/docs/content/tutorials/quickstart.md

If you still see the error in master, please raise a pull request to fix it
-- and thank you!

On Wed, Jul 25, 2018 at 3:30 PM Himanshu Pandey 
wrote:

> Hi Team,
>
> I am new to druid and would like to contribute to the project.
>
> While setting it up on my local, I found one minor documentation issue
> here:
>
> http://druid.io/docs/0.12.1/tutorials/quickstart.html
>
> How can I proceed with creating the issue and possibly fixing it?
>
> *Thanks & Regards,*
> *Himanshu Pandey*
> *Cell: +1 (408) 644 - 8765*
>


Re: About creating 0.12.2-rc1

2018-08-05 Thread Gian Merlino
+1, and fwiw, it looks like Apache projects don't always need to do votes
for creating release candidates. For example on the Calcite mailing list I
see votes for _final_ releases, but the release candidates seem to be
created and uploaded without a vote. There is generally some discussion on
the list about whether it's a good time to do a release candidate, but I
don't generally see formal votes. I think something similar could work for
us in the future and could help us get releases out quicker.

On Fri, Aug 3, 2018 at 9:38 PM Prashant Deva 
wrote:

> +1
> Prashant
>
>
> On Fri, Aug 3, 2018 at 7:11 PM Niketh Sabbineni <
> niketh.sabbin...@gmail.com>
> wrote:
>
> > +1
> >
> > Looking forward to this
> >
> > On Fri, Aug 3, 2018 at 7:09 PM Jihoon Son  wrote:
> >
> > > Hi folks,
> > >
> > > Releasing 0.12.2 has been delayed because, fortunately, we could find
> > more
> > > bugs to be fixed before release.
> > >
> > > Currently, there remains only one PR (
> > > https://github.com/apache/incubator-druid/pull/6106 ) to be merged for
> > > 0.12.2. Once the Travis CI passes, I'll merge that PR shortly. Then,
> > we're
> > > ready for 0.12.2-rc1 release.
> > >
> > > So, I think it's time to ask your opinion about creating 0.12.2-rc1
> > without
> > > the release vote. I think it makes sense because we have already had
> two
> > > votes (
> > >
> > >
> >
> https://lists.apache.org/thread.html/a96f2e39506118be26184bd950bc51d360107d75e9ac547d8597817a@%3Cdev.druid.apache.org%3E
> > > ,
> > >
> > >
> >
> https://lists.apache.org/thread.html/11a50f22e7669a527625e190bebbe50b7586dd72733c3bf6a1024c02@%3Cdev.druid.apache.org%3E
> > > )
> > > for 0.12.2-rc1 release and there's no objection.
> > >
> > > If there's no objection for this for 48 hours, I'll start 0.12.2-rc1
> > > release.
> > >
> > > Best,
> > > Jihoon
> > >
> > --
> > Niketh Sabbineni
> >
>


Re: Issue with documentation

2018-07-26 Thread Gian Merlino
Just a pull request is good enough.

On Wed, Jul 25, 2018 at 4:23 PM Himanshu Pandey 
wrote:

> Thanks Gian! Do I still need to create a new issue in GitHub for this ? or
> just a pull request will suffice?
>
>
>
> *Thanks & Regards,*
> *Himanshu Pandey*
>
>
> On Wed, Jul 25, 2018 at 3:39 PM, Gian Merlino  wrote:
>
> > Hey Himanshu,
> >
> > The docs are kept in the main Druid repo and released to the site when we
> > do Druid releases. The source for the doc you mentioned is here:
> > https://github.com/apache/incubator-druid/blob/master/
> > docs/content/tutorials/quickstart.md
> >
> > If you still see the error in master, please raise a pull request to fix
> it
> > -- and thank you!
> >
> > On Wed, Jul 25, 2018 at 3:30 PM Himanshu Pandey <
> > himanshu.pande...@gmail.com>
> > wrote:
> >
> > > Hi Team,
> > >
> > > I am new to druid and would like to contribute to the project.
> > >
> > > While setting it up on my local, I found one minor documentation issue
> > > here:
> > >
> > > http://druid.io/docs/0.12.1/tutorials/quickstart.html
> > >
> > > How can I proceed with creating the issue and possibly fixing it?
> > >
> > > *Thanks & Regards,*
> > > *Himanshu Pandey*
> > > *Cell: +1 (408) 644 - 8765*
> > >
> >
>


Re: Docs for 'master'

2018-08-10 Thread Gian Merlino
That sounds nice, I am all for it. It should ideally be automated, maybe
with Jenkins.

On Fri, Aug 10, 2018 at 5:13 PM Jihoon Son  wrote:

> Hi all,
>
> We currently have the following system for docs.
>
> - http://druid.io/docs/{version}: docs for a specific version
> - http://druid.io/docs/latest: latest docs. These docs are basically based
> on the latest release, but it can contain more recent docs which are not
> released yet.
>
> This system sometimes confuses people because 'latest' can be
> interpreted as the 'latest release'. So, I'm proposing a new set of docs for
> the 'master' branch that always shows the most recent docs. We might call
> this 'dev' or something better. Then, we would have the following docs:
>
> - http://druid.io/docs/{version}: docs for a specific version
> - http://druid.io/docs/latest: docs for the latest release
> - http://druid.io/docs/dev: docs for the master branch
>
> IMO, this system would reduce confusion and let us publish recent
> documents quickly. Quick document publishing is important for developers as
> well as users because we also need to refer to the documents to
> test/improve/develop/review features.
>
> Any thoughts are welcome.
>
> Best,
> Jihoon
>


Re: Changing release process for release candidates

2018-08-06 Thread Gian Merlino
It sounds good to me, it streamlines things a bit and seems to be what
other projects are doing. As Julian pointed out in the other thread it
still pays to have someone "managing" the release and to have some
discussion about when's the right time to start a release branch. The
"release manager" job has rotated through a few different people over the
past few major releases, which is good.

On Mon, Aug 6, 2018 at 3:20 PM Jihoon Son  wrote:

> Hi all,
>
> Our current release process for RCs begins with a vote. It usually takes
> a few days, but is not actually mandatory for creating RCs. If we can reach
> consensus without explicit votes, we can expect faster releases in the
> future.
>
> The original discussion is available at
>
> https://lists.apache.org/thread.html/d887f0c6e23f1625e549389c08a9a5e74a7a24db4d5e007b6e8d10f6@%3Cdev.druid.apache.org%3E
> .
>
> Any ideas are welcome.
>
> Best,
> Jihoon
>


Re: Druid 0.12.2 release vote

2018-08-07 Thread Gian Merlino
+1. Thank you Jihoon for running this release.

On Tue, Aug 7, 2018 at 10:04 AM Jihoon Son  wrote:

> Sure,
>
> the release note is available here:
> https://github.com/apache/incubator-druid/issues/6116.
>
> Best,
> Jihoon
>
> On Tue, Aug 7, 2018 at 10:02 AM Charles Allen  wrote:
>
> > ((don't let this ask block the release))
> >
> > Is there a way to get a preview of what the release notice will look
> like?
> >
> > On Mon, Aug 6, 2018 at 3:38 PM Fangjin Yang  wrote:
> >
> > > +1
> > >
> > > On Mon, Aug 6, 2018 at 3:03 PM, Jihoon Son 
> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Druid 0.12.2-rc1 (http://druid.io/downloads.html) is available now,
> > and
> > > I
> > > > think it's time to vote on the 0.12.2 release. Please note that
> 0.12.2
> > is
> > > > not an ASF release.
> > > >
> > > > Here is my +1.
> > > >
> > > > Best,
> > > > Jihoon
> > > >
> > >
> >
>


Re: Podling Report (August 2018)

2018-08-06 Thread Gian Merlino
It looks like the page is up now, so I posted this report there.

On Thu, Aug 2, 2018 at 3:03 PM Gian Merlino  wrote:

> That sounds like a good question for gene...@incubator.apache.org.
>
> On Thu, Aug 2, 2018 at 2:22 PM Jonathan Wei  wrote:
>
>> Hm, looks like the wiki page https://wiki.apache.org/incubator/August2018
>> still
>> doesn't exist, any idea when it'll be up?
>>
>>
>> On Thu, Aug 2, 2018 at 1:04 PM, Julian Hyde  wrote:
>>
>> > Please email the mentors (or the dev list) when it is posted to the wiki
>> > and is ready for sign-off.
>> >
>> > Julian
>> >
>> >
>> > > On Aug 1, 2018, at 7:19 PM, Jonathan Wei  wrote:
>> > >
>> > > I don't see the incubator wiki page for August 2018 up yet, so I'll
>> post
>> > > the current report here for now:
>> > >
>> > >
>> > > Druid Podling Report (August 2018)
>> > > 
>> > >
>> > > Druid is a high-performance, column-oriented, distributed data store.
>> > >
>> > > Druid has been incubating since 2018-02-28.
>> > >
>> > > Three most important issues to address in the move towards graduation:
>> > >
>> > > 1. Plan and execute our first Apache release.
>> > > 2. Move the website to Apache infrastructure.
>> > > 3. Expand the community and add more committers.
>> > >
>> > > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
>> > aware
>> > > of?
>> > >
>> > > - None.
>> > >
>> > > How has the community developed since the last report?
>> > >
>> > > - A healthy, constant flow of bug fixes, quality improvements and new
>> > > features
>> > >  are still ongoing at https://github.com/apache/incubator-druid.
>> > > - Our next community meetup has been scheduled for August 8.
>> > >
>> > > How has the project developed since the last report?
>> > >
>> > > - This report covers activity since the May 2018 report
>> > > - Source has been migrated to Apache infrastructure
>> > > - License header updates are almost complete
>> > > - Since the last report there have been 93 commits from 20
>> individuals.
>> > > - We have released 0.12.1, a non-incubator release.
>> > > - We are currently voting on the 0.12.2 non-incubator bug fix release.
>> > > This will be our final non-incubator release.
>> > >
>> > > How would you assess the podling's maturity?
>> > > Please feel free to add your own commentary.
>> > >
>> > >  [ ] Initial setup
>> > >  [X] Working towards first release
>> > >  [ ] Community building
>> > >  [ ] Nearing graduation
>> > >  [ ] Other:
>> > >
>> > > Date of last release:
>> > >
>> > > - Druid 0.12.1 on 2018-06-08 (non-Apache release)
>> > > - No official Apache release yet since beginning Apache Incubation
>> > >
>> > > When were the last committers or PPMC members elected?
>> > >
>> > > - Project is still functioning with the initial set of committers.
>> > >
>> > > On Tue, Jul 31, 2018 at 4:02 PM, Jonathan Wei 
>> wrote:
>> > >
>> > >> Thanks for reviewing.
>> > >>
>> > >> Our last meetup was in March, but we have an upcoming meetup on
>> August 8
>> > >> (after the report is due I assume), should we mention that in this
>> > report?
>> > >>
>> > >> I noticed the incubator wiki page for August 2018 hasn't been created
>> > yet,
>> > >> does anyone know if that's expected to be up soon? (Not sure on what
>> > exact
>> > >> date our podling report is due)
>> > >>
>> > >> - Jon
>> > >>
>> > >>
>> > >> On Sun, Jul 29, 2018 at 10:30 AM, Julian Hyde <
>> jhyde.apa...@gmail.com>
>> > >> wrote:
>> > >>
>> > >>> Thanks - this looks good.
>> > >>>
>> > >>> I’d not include the url for the case to replace license headers.
>> > Reports
>> > >>> rarely include urls whose sole purpose is to prove statements in the
>> > >>> report.
>> > >>>
>> > >>> If there have been meet ups/t

Re: Different query results for 0.12.2 and 0.10.1

2018-08-06 Thread Gian Merlino
Hi Samarth,

The doubleSum difference is likely due to the fact that before 0.11.0,
Druid read values out of columns as 32 bit floats and then cast them to 64
bit doubles. Now it can read them directly as 64 bit doubles. And actually,
it can _store_ floating point values as 64 bit doubles too, although this
won't be enabled by default until 0.13.0 (see
http://druid.io/docs/latest/configuration/index.html#double-column-storage
for how to enable it today).
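To illustrate the effect of a 32-bit read followed by a widening cast, here is a minimal Python sketch. It is not Druid code: the helper name `through_float32` is made up for this example, and it narrows a single value (the one from the question) rather than summing a real column. At this magnitude a 32-bit float can only represent steps of 0.0625, so the fractional part is rounded away.

```python
import struct


def through_float32(value: float) -> float:
    """Round a 64-bit Python float to 32-bit precision, then widen it again."""
    # Packing with "<f" stores the value as an IEEE 754 single-precision float;
    # unpacking returns it as a regular (64-bit) Python float.
    return struct.unpack("<f", struct.pack("<f", value))[0]


stored = 616346.0208094628           # example value taken from the question
narrowed = through_float32(stored)   # what a float32 read plus cast would yield

print(narrowed)            # the value after the 32-bit round trip
print(stored - narrowed)   # rounding error introduced by the 32-bit step
```

In a real doubleSum the narrowing would apply to each stored value before summing, so the size of the difference depends on the data.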

Some thoughts on specific query types:

- The ordering of select results can vary due to differing choices about
which segments to read first. The results will stay in time order, but two
results with the same timestamp might swap positions. Btw, if you don't
need the strict time ordering guarantees, consider Scan queries (
http://druid.io/docs/latest/querying/scan-query.html) which are much
lighter in terms of memory usage.
- The exact ranking and values of TopN results can also vary, since topNs
are approximate and their results can vary based on which segments are
processed in which order and on which servers.
- GroupBy I would not expect to vary: what kinds of differences are you
seeing there?
- Search I'm not familiar with enough to think of a reason why it should or
shouldn't vary.
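For the select case, one way a test harness could tolerate rows with equal timestamps swapping positions is to normalize both result sets before comparing them. The sketch below is only an assumption about how such a harness might look: the row shape (a flat dict with a `timestamp` key) and the `normalize` helper are hypothetical and do not mirror the actual select response format.

```python
import json


def normalize(rows, time_key="timestamp"):
    """Sort rows by timestamp, breaking ties with a canonical JSON dump of the row."""
    return sorted(rows, key=lambda r: (r[time_key], json.dumps(r, sort_keys=True)))


# Two hypothetical result sets that differ only in the order of tied rows.
old = [{"timestamp": "2018-08-01T00:00:00Z", "page": "b"},
       {"timestamp": "2018-08-01T00:00:00Z", "page": "a"}]
new = [{"timestamp": "2018-08-01T00:00:00Z", "page": "a"},
       {"timestamp": "2018-08-01T00:00:00Z", "page": "b"}]

print(old == new)                        # raw comparison is order-sensitive
print(normalize(old) == normalize(new))  # comparison after normalization
```

This only helps when both clusters return the same set of rows. With a paging threshold, tied rows at a page boundary can still land on different pages.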

One thing you can do to get more consistent results for comparison is to
add "bySegment" : true to your query context. This skips the merging step
and just returns sub-results for each segment individually. Most of the
potential variation is introduced in the merging step, so this should give
you more consistent results, with the caveat that you won't be testing the
merging step.
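As a rough sketch of what that looks like in a test harness, the Python snippet below adds the flag to a native query's context before it is serialized. The data source, interval, and aggregator names are placeholders, and `with_by_segment` is a hypothetical helper, not part of any Druid client.

```python
import json


def with_by_segment(query: dict) -> dict:
    """Return a copy of a native query with "bySegment": true in its context."""
    context = dict(query.get("context", {}))  # copy so the input is not mutated
    context["bySegment"] = True
    return {**query, "context": context}


query = {
    "queryType": "timeseries",
    "dataSource": "example_datasource",      # placeholder name
    "intervals": ["2018-08-01/2018-08-02"],  # placeholder interval
    "granularity": "all",
    "aggregations": [
        {"type": "doubleSum", "name": "total", "fieldName": "value"}
    ],
}

# The resulting JSON body is what the harness would send to each cluster.
print(json.dumps(with_by_segment(query), indent=2))
```

Running the same body against both clusters and comparing the per-segment sub-results keeps the comparison independent of merge order.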

On Sun, Aug 5, 2018 at 10:55 PM Samarth Jain  wrote:

> I have an internal test harness set up that I am using for testing a version
> upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed that
> executing the same query against the same data sources (on different Druid
> clusters) gives slightly different results for 0.10.1 and 0.12.2. I have
> seen this happen for search, group by, top n, select query types. The
> common part in all such queries is that they have a paging spec with
> descending set to false.
>
> "pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000},
> "descending": false
>
> My guess is that the data is distributed slightly differently within the two
> clusters, which combined with the paging spec is causing this mismatch. Is my
> guess correct? If so, is there a way to make this kind of testing
> deterministic?
>
> The other thing that I observed is that with doubleSum aggregation type,
> 0.10.1 is returning values with lower precision (ex - 616346.0) as opposed
> to 0.12.1 (ex - 616346.0208094628). Did something change to cause this
> change in precision?
>


Re: 0.10.1 and 0.12.2 group by/search/select/top n query results

2018-08-06 Thread Gian Merlino
Hi Samarth,

It looks like you posted this message twice with two different subjects. I
responded to the other one, titled "Different query results for 0.12.2 and
0.10.1".

On Sun, Aug 5, 2018 at 9:41 PM Samarth Jain  wrote:

> I have an internal test harness setup that I am using for testing version
> upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed that
> executing the same query against the same data source gives slightly
> different results for 0.10.1 and 0.12.2. I have seen this happen for
> search, group by, top n, select query types. The common part in all such
> queries is that they have a paging spec with descending set to false.
>
> "pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000},
> "descending": false
>
> My guess is that the data is distributed slightly differently within the
> two clusters which is causing this mismatch. Is my guess correct? If so, is
> there a way to make this comparison deterministic?
>
> The other thing that I observed is that with doubleSum aggregation type,
> 0.10.1 is returning values with lower precision (ex - 616346.0) as opposed
> to 0.12.1 (ex - 616346.0208094628). Did something change to cause this
> change in precision?
>


Re: About creating 0.12.2-rc1

2018-08-06 Thread Gian Merlino
Thanks Jihoon! Would you mind starting the other thread too when you get a
chance?

On Mon, Aug 6, 2018 at 10:13 AM Jihoon Son  wrote:

> Thanks guys.
>
> I'm creating 0.12.2-rc1 now.
>
> Regarding creating an RC without a vote, I think it's worth having a
> discussion in another thread to make sure everyone knows about the new RC
> release process.
>
> Best,
> Jihoon
>
> On Sun, Aug 5, 2018 at 10:30 AM Julian Hyde 
> wrote:
>
> > Gian is correct. Creating an RC doesn’t require a vote. It does require a
> > release manager. Usually in Calcite we determine the timeframe of the
> > release, and choose an RM, by a discussion that reaches consensus without
> > an explicit vote.
> >
> > The RM may do a little “traffic control”, asking whether people consider
> > the branch is in good shape, and perhaps asking people to stop pushing,
> > again by a non-vote email thread.
> >
> > Julian
> >
> > > On Aug 5, 2018, at 8:56 AM, Gian Merlino  wrote:
> > >
> > > > +1, and fwiw, it looks like Apache projects don't always need to do
> > > > votes for creating release candidates. For example on the Calcite
> > > > mailing list I see votes for _final_ releases, but the release
> > > > candidates seem to be created and uploaded without a vote. There is
> > > > generally some discussion on the list about whether it's a good time
> > > > to do a release candidate, but I don't generally see formal votes. I
> > > > think something similar could work for us in the future and could
> > > > help us get releases out quicker.
> > >
> > > On Fri, Aug 3, 2018 at 9:38 PM Prashant Deva 
> > > wrote:
> > >
> > >> +1
> > >> Prashant
> > >>
> > >>
> > >> On Fri, Aug 3, 2018 at 7:11 PM Niketh Sabbineni <
> > >> niketh.sabbin...@gmail.com>
> > >> wrote:
> > >>
> > >>> +1
> > >>>
> > >>> Looking forward to this
> > >>>
> > >>>> On Fri, Aug 3, 2018 at 7:09 PM Jihoon Son 
> > wrote:
> > >>>>
> > >>>> Hi folks,
> > >>>>
> > > >>>> Releasing 0.12.2 has been delayed because, fortunately, we could
> > > >>>> find more bugs to be fixed before release.
> > > >>>>
> > > >>>> Currently, there remains only one PR (
> > > >>>> https://github.com/apache/incubator-druid/pull/6106 ) to be merged
> > > >>>> for 0.12.2. Once the Travis CI passes, I'll merge that PR shortly.
> > > >>>> Then, we're ready for 0.12.2-rc1 release.
> > > >>>>
> > > >>>> So, I think it's time to ask your opinion about creating 0.12.2-rc1
> > > >>>> without the release vote. I think it makes sense because we have
> > > >>>> already had two votes (
> > > >>>> https://lists.apache.org/thread.html/a96f2e39506118be26184bd950bc51d360107d75e9ac547d8597817a@%3Cdev.druid.apache.org%3E
> > > >>>> ,
> > > >>>> https://lists.apache.org/thread.html/11a50f22e7669a527625e190bebbe50b7586dd72733c3bf6a1024c02@%3Cdev.druid.apache.org%3E
> > > >>>> )
> > > >>>> for 0.12.2-rc1 release and there's no objection.
> > > >>>>
> > > >>>> If there's no objection for this for 48 hours, I'll start 0.12.2-rc1
> > > >>>> release.
> > >>>>
> > >>>> Best,
> > >>>> Jihoon
> > >>>>
> > >>> --
> > >>> Niketh Sabbineni
> > >>>
> > >>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > For additional commands, e-mail: dev-h...@druid.apache.org
> >
> >
>


Re: Druid 0.12.2 release vote

2018-08-09 Thread Gian Merlino
Nice!!

Although I don't see the graphic attached, maybe the mailing list ate it?

On Wed, Aug 8, 2018 at 4:15 PM Charles Allen 
wrote:

> Blue is 0.12.2 with some minor backports not perf related. Red is from the
> 0.11.x series. This is effectively a bucketed PDF of the query times for a
> live cluster with Timeseries queries as self-reported by historical nodes.
> I mentioned elsewhere I'm not convinced query/time is a good proxy for user
> experience, but it does provide a good baseline for comparisons between
> versions. Low query times are suspected to be due to some aggressive caching
> or complete node misses (the node has very little data for that time range
> for that datasource). And high query time outliers are often the result of
> bad GC.
>
> On our side there is a new java version going out with the 0.12.2
> deployment so it is unclear how much is attributed to the new java version
> and how much is attributed to the druid jars or other config changes.
> Overall things seem to consistently display a small % improvement in the
> mean with our internal 0.12.2 release. This is good!
>
> Cheers,
> Charles Allen
>
> [image: Screen Shot 2018-08-08 at 4.01.24 PM.png]
>
>
> On Wed, Aug 8, 2018 at 3:11 PM David Lim  wrote:
>
>> +1, thank you!
>>
>> On Wed, Aug 8, 2018 at 3:16 PM Jonathan Wei  wrote:
>>
>> > +1, thanks Jihoon!
>> >
>> > On Wed, Aug 8, 2018 at 1:18 PM, Jihoon Son 
>> wrote:
>> >
>> > > Awesome! Thanks Charles!
>> > >
>> > > Jihoon
>> > >
>> > > On Wed, Aug 8, 2018 at 1:16 PM Gian Merlino  wrote:
>> > >
>> > > > Thanks, it will be nice to see!
>> > > >
>> > > > On Wed, Aug 8, 2018 at 1:15 PM Charles Allen <
>> charles.al...@snap.com
>> > > > .invalid>
>> > > > wrote:
>> > > >
>> > > > > I don't think it should be a blocker to release, but I have to
>> > > > > run perf tests for rollouts anyways so I figured I'd publish what
>> > > > > I find :-P
>> > > > >
>> > > > > Cheers,
>> > > > > Charles Allen
>> > > > >
>> > > > >
>> > > > > On Wed, Aug 8, 2018 at 12:33 PM Gian Merlino 
>> > wrote:
>> > > > >
>> > > > > > That being said, Charles I am definitely looking forward to
>> > > > > > your report of what the upgrade from 0.11 -> 0.12.2-rc1 is like
>> > > > > > in your cluster!
>> > > > > >
>> > > > > > On Wed, Aug 8, 2018 at 12:30 PM Gian Merlino 
>> > > wrote:
>> > > > > >
>> > > > > > > My thought is that recently we have started doing small bug-fix
>> > > > > > > releases more often (0.12.1 and 0.12.2 were both small
>> > > > > > > releases) and I think it makes sense to continue this practice.
>> > > > > > > It makes sense to get them out quickly, since shipping bug
>> > > > > > > fixes is good. IMO trying to validate bug fix releases within
>> > > > > > > the customary Apache style 72 hour voting period is a good
>> > > > > > > goal.
>> > > > > > >
>> > > > > > > On the other hand we do strive to put out high quality
>> > > > > > > releases, and we don't want bug fix releases to introduce
>> > > > > > > regressions. Testing every single patch in real clusters is an
>> > > > > > > important part of that. All I can do is encourage people
>> > > > > > > running real clusters to deploy RCs as fast as they can! Fwiw,
>> > > > > > > we have already incorporated all the 0.12.2 patches into our
>> > > > > > > Imply distro of Druid and already have a good number of users
>> > > > > > > running them. So my +1 earlier incorporated knowledge that the
>> > > > > > > patches have been validated
>> > > >
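
The "bucketed PDF of the query times" that Charles describes earlier in this
thread can be approximated with a normalized fixed-width histogram. The sketch
below is a rough illustration only: the function name bucketed_pdf, the bucket
width and the sample values are all made up, and nothing here reflects how the
actual graphic was produced.

```python
def bucketed_pdf(query_times_ms, bucket_width_ms):
    """Return {bucket_start_ms: fraction_of_queries} for fixed-width buckets."""
    if not query_times_ms:
        return {}
    counts = {}
    for t in query_times_ms:
        # Map each query time to the lower edge of its bucket.
        start = int(t // bucket_width_ms) * bucket_width_ms
        counts[start] = counts.get(start, 0) + 1
    total = len(query_times_ms)
    # Dividing by the total turns counts into fractions that sum to 1.
    return {start: n / total for start, n in sorted(counts.items())}

# Made-up query/time samples in milliseconds, not taken from any real cluster.
samples = [3, 7, 12, 18, 19, 25, 31, 250]
pdf = bucketed_pdf(samples, bucket_width_ms=10)
```

Computing one such distribution per version and overlaying them gives the kind
of version-to-version baseline comparison discussed above; the long tail (the
250 ms sample here) is where GC-style outliers would appear.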

Re: Nightly build!

2018-08-09 Thread Gian Merlino
I found http://www.apache.org/legal/release-policy.html#host-rc which looks
like the policy we should follow if we start doing nightly builds. I guess
we shouldn't archive nightly builds, since there would be too many.

On Thu, Aug 9, 2018 at 6:02 PM Jihoon Son  wrote:

> Hi all,
>
> Nightly builds would be useful for folks who want to stay on the bleeding
> edge of Druid. I'm thinking of adding a Jenkins job to
> https://builds.apache.org/ which checks every hour whether there are changes
> in the master branch and, if so, builds a new build. Once the build succeeds,
> the binary is archived, so that we can add a link to the binary to our web
> page. If the build fails, a notification would be sent to
> comm...@druid.apache.org.
>
> Any thoughts are welcome.
>
> Best,
> Jihoon
>


Re: CLA still required?

2018-08-17 Thread Gian Merlino
I think since the source is migrated now, what sounds right to me is to
accept Apache CLAs/SGAs for new committers, corporate contributors, and
major code transfers (like any other Apache project). And I think we
probably don't need to keep collecting our old CLAs, especially not for
minor contributions. Happy to get input from other people on this as it is
not my area of expertise.

On Tue, Aug 14, 2018 at 5:09 PM Jonathan Wei  wrote:

> Now that we've migrated the sources (but are still incubating), should we
> still ask new contributors to sign http://druid.io/community/cla.html?
>
>
> On Fri, Jun 1, 2018 at 12:52 PM, Gian Merlino  wrote:
>
> > Yes we are still collecting them, although once we are fully migrated to
> > ASF, then we won't anymore (as per ASF policy - as I understand it - CLAs
> > are only required for committers).
> >
> > On Fri, Jun 1, 2018 at 6:27 AM, Pierre Lacave 
> wrote:
> >
> > > Hi,
> > >
> > > With the incubation ongoing, do you still require CLA signed for
> > > contributions?
> > >
> > > Thanks
> > >
> >
>


Re: Late June/July podling reports

2018-07-18 Thread Gian Merlino
OMG! I don't see a report reminder for July. Is that not something that is
happening anymore? I was relying on getting one of those…

IMO, there is no reason to write the older June report. We missed it, and
that is sad, but it is probably not super interesting to look back and see
what was happening then. It seems more useful to write about the current
state today. I'd rewrite the first bullet ("1. Move the source code and
website to Apache infrastructure.") to reflect that we actually have moved
source code already, and include "done migrating source code" in the "how
has the project developed" section. Also the
"https://github.com/druid-io/druid" link is wrong now, it should be the
incubator repo.

On Wed, Jul 18, 2018 at 1:57 PM Jonathan Wei  wrote:

> We neglected to submit podling reports for June and July, so I put together
> reports for those two months.
>
> I'm putting them here for internal review first, please comment if you have
> any feedback/changes.
>
> June:
> 
> Druid (As of June 01, 2018)
>
> Druid is a high-performance, column-oriented, distributed data store.
>
> Druid has been incubating since 2018-02-28.
>
> Three most important issues to address in the move towards graduation:
>
>  1. Move the source code and website to Apache infrastructure.
>  2. Plan and execute our first Apache release.
>  3. Expanding the community and adding more committers
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> of?
>
> - None.
>
> How has the community developed since the last report?
>
> - A healthy, constant flow of bug fixes, quality improvements and new
> features
>   are still ongoing on https://github.com/druid-io/druid.
>
> How has the project developed since the last report?
>
> - SGA and ICLA status sorted out, ready to migrate source to Apache repo
> - Since the last report there have been 22 commits from 12 individuals.
> - We have conducted a vote to put out the 0.12.1 release. This release
> candidate is being done outside the Incubator.
>
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
>
>   [X] Initial setup
>   [ ] Working towards first release
>   [ ] Community building
>   [ ] Nearing graduation
>   [ ] Other:
>
> Date of last release:
>
> - Druid 0.12.0 on 2018-03-06 (non-Apache release)
> - No official Apache release yet since beginning Apache Incubation
>
> When were the last committers or PPMC members elected?
>
> - Project is still functioning with the initial set of committers.
>
>
>
>
>
> July
> 
>
> Druid (As of July 01, 2018)
>
> Druid is a high-performance, column-oriented, distributed data store.
>
> Druid has been incubating since 2018-02-28.
>
> Three most important issues to address in the move towards graduation:
>
>  1. Move the source code and website to Apache infrastructure.
>  2. Plan and execute our first Apache release.
>  3. Expanding the community and adding more committers
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> of?
>
> - None.
>
> How has the community developed since the last report?
>
> - A healthy, constant flow of bug fixes, quality improvements and new
> features
>   are still ongoing on https://github.com/druid-io/druid.
>
> How has the project developed since the last report?
>
> - Source migration to Apache infrastructure is in progress (
> https://issues.apache.org/jira/browse/INFRA-16674)
> - Since the last report there have been 47 commits from 14 individuals.
> - We have released 0.12.1, a non-incubator release.
> - We are working on 0.12.2, a bug fix release. To get the bug fixes to
> users faster, this will be another non-incubator release.
>
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
>
>   [ ] Initial setup
>   [X] Working towards first release
>   [ ] Community building
>   [ ] Nearing graduation
>   [ ] Other:
>
> Date of last release:
>
> - Druid 0.12.1 on 2018-06-08 (non-Apache release)
> - No official Apache release yet since beginning Apache Incubation
>
> When were the last committers or PPMC members elected?
>
> - Project is still functioning with the initial set of committers.
>


Re: Java script engine to be removed

2018-07-16 Thread Gian Merlino
IIRC we never moved to Nashorn, and are still using Rhino. Certainly, all
the imports in the relevant files are "org.mozilla".

On Sat, Jul 14, 2018 at 1:54 PM Charles Allen  wrote:

> http://openjdk.java.net/jeps/335
>
> https://bugs.openjdk.java.net/browse/JDK-8202786
>
> The javascript Nashorn engine is deprecated and slated to be removed in the
> next long term support release of Java.
>
> https://github.com/apache/incubator-druid/issues/5589 is the ticket for
> maintaining future java support in Druid.
>
> Not quite sure what the best way forward is. Should we revert back to rhino
> https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Rhino ?
>


Re: Travis Permissions

2018-07-16 Thread Gian Merlino
Fwiw, I do have permissions to restart individual builds, but I'm not sure
why.

On Sat, Jul 14, 2018 at 12:09 PM Roman Leventov  wrote:

> Seems that I no longer have permissions to restart builds in Travis. Could
> we do something about this, or we now need to close/open PRs to restart
> builds?
>


Re: Build failure on 0.13.SNAPSHOT

2018-07-23 Thread Gian Merlino
Interesting. Fwiw, I am using Maven 3.5.2 for building Druid and it has
been working for me. I don't think I'm using any special Maven
overrides (at least, I don't see anything interesting in my ~/.m2 directory
or in my environment variables). It might have to do with how much memory
our machines have? I do most of my builds on a Mac with 16GB RAM. Maybe try
checking .travis.yml in the druid repo. It sets -Xmx3000m for mvn install
commands, which might be needed for more low memory environments.

$ mvn --version
Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d;
2017-10-18T00:58:13-07:00)
Maven home: /usr/local/Cellar/maven/3.5.2/libexec
Java version: 1.8.0_161, vendor: Oracle Corporation
Java home:
/Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"

On Mon, Jul 23, 2018 at 6:40 AM Dongjin Lee  wrote:

> Finally, it seems like I found the reason. It was a composition of several
> problems:
>
> - Druid should not be built with maven 3.5.x. With 3.5.2, test suites like
> `GroupByQueryRunnerFailureTest` fail. After I switched to 3.3.9, which is
> built into the latest version of IntelliJ, those errors disappeared. It seems
> like maven 3.5.x is not stable yet - it applied a drastic change, and it is
> also why they skipped 3.4.x.
> - It seems like Druid requires some MaxDirectMemorySize configuration for
> some test suites. With a JVM parameter like `-XX:MaxDirectMemorySize=4g`
> some test suites passed, but not all. I am now trying the other
> options with enlarged swap space.
>
> Question: How much MaxDirectMemorySize configuration are you using?
>
> Best,
> Dongjin
>
> On Sat, Jul 21, 2018 at 3:01 AM Jihoon Son  wrote:
>
> > Hi Dongjin,
> >
> > that is weird. It looks like the vm crashed because of out of memory
> > while testing.
> > It might be a real issue or not.
> > Have you set any memory configuration for your maven?
> >
> > Jihoon
> >
> > On Thu, Jul 19, 2018 at 7:09 PM Dongjin Lee  wrote:
> >
> > > Hi Jihoon,
> > >
> > > I ran `mvn clean package` following development/build
> > > <https://github.com/apache/incubator-druid/blob/master/docs/content/development/build.md>.
> > >
> > > Dongjin
> > >
> > > On Fri, Jul 20, 2018 at 12:30 AM Jihoon Son 
> > wrote:
> > >
> > > > Hi Dongjin,
> > > >
> > > > what maven command did you run?
> > > >
> > > > Jihoon
> > > >
> > > > On Wed, Jul 18, 2018 at 10:38 PM Dongjin Lee 
> > wrote:
> > > >
> > > > > Hello. I am trying to build druid, but it fails. My environment is
> > > > > like the following:
> > > > >
> > > > > - CPU: Intel(R) Core(TM) i7-7560U CPU @ 2.40GHz
> > > > > - RAM: 7704 MB
> > > > > - OS: ubuntu 18.04
> > > > > - JDK: openjdk version "1.8.0_171" (default configuration, with
> > > > >   MaxHeapSize = 1928 MB)
> > > > > - Branch: master (commit: cd8ea3d)
> > > > >
> > > > > The error message I got is:
> > > > >
> > > > > > > [INFO] ------------------------------------------------------------
> > > > > > > [INFO] Reactor Summary:
> > > > > > > [INFO]
> > > > > > > [INFO] io.druid:druid ..................... SUCCESS [ 50.258 s]
> > > > > > > [INFO] java-util .......................... SUCCESS [03:57 min]
> > > > > > > [INFO] druid-api .......................... SUCCESS [ 22.694 s]
> > > > > > > [INFO] druid-common ....................... SUCCESS [ 14.083 s]
> > > > > > > [INFO] druid-hll .......................... SUCCESS [ 17.126 s]
> > > > > > > [INFO] extendedset ........................ SUCCESS [ 10.856 s]
> > > > > > > *[INFO] druid-processing ................... FAILURE [04:36 min]*
> > > > > > > [INFO] druid-aws-common ................... SKIPPED
> > > > > > > [INFO] druid-server ....................... SKIPPED
> > > > > > > [INFO] druid-examples ..................... SKIPPED
> > > > > > > ...
> > > > > > > [INFO] ------------------------------------------------------------
> > > > > > > [INFO] BUILD FAILURE
> > > > > > > [INFO] ------------------------------------------------------------
> > > > > > > [INFO] Total time: 10:29 min
> > > > > > > [INFO] Finished at: 2018-07-19T13:23:31+09:00
> > > > > > > [INFO] Final Memory: 88M/777M
> > > > > > > [INFO] ------------------------------------------------------------
> > > > > > >
> > > > > > > *[ERROR] Failed to execute goal
> > > > > > > org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test
> > > > > > > (default-test) on project druid-processing: Execution
> > > > > > > default-test of goal
> > > > > > 

Re: Release process for Maven artifacts?

2018-07-23 Thread Gian Merlino
Hi Joseph,

For official releases we just do mvn release:prepare release:perform. The
poms encode everything that has to happen, including sources jars, javadocs
jars, signing everything, and pushing it up to Maven Central. You might be
able to modify them to push to a different repo.

On Mon, Jul 23, 2018 at 5:06 PM Joseph Glanville  wrote:

> Hi,
>
> Is the release process for publishing Maven artifacts documented somewhere?
> We have been building tar archives with `mvn package` successfully but we
> would like to publish our own Maven artifacts also, including `-sources`
> JARs so we can add them as dependencies to projects outside of the Druid
> source tree.
>
> Joseph.
>


Re: Druid 0.12.2-rc1 vote

2018-07-24 Thread Gian Merlino
+1

On Thu, Jul 19, 2018 at 1:09 PM Jihoon Son  wrote:

> Hi all,
>
> we have no open issues and PRs for 0.12.2 (
> https://github.com/apache/incubator-druid/milestone/27). The 0.12.2 branch
> is already available and all PRs for 0.12.2 have merged into that branch.
>
> Let's vote on releasing RC1. Here is my +1.
>
> This is a non-ASF release.
>
> Best,
> Jihoon
>


Re: GitBox review comments

2018-07-24 Thread Gian Merlino
There is a github feature to make multiple comments as a single "review",
although from what I can see, gitbox splits those up into multiple emails
anyway, so it doesn't help. I have poked Infra again on our ticket:
https://issues.apache.org/jira/browse/INFRA-16674

On Tue, Jul 24, 2018 at 5:55 PM Julian Hyde  wrote:

> I know there is an open request with INFRA to route git review comments to
> a list other than dev. But until then, we are still getting a lot of
> messages on the list each day. A lot of them come in groups, as a reviewer
> makes multiple comments on a particular PR.
>
> I believe git has a feature called “review” where a reviewer can make
> multiple comments and they all appear in the same email. Is this feature
> supported in gitbox? If so, could reviewers consider using it, in order to
> reduce the email volume?
>
> Julian
>


Re: Subscription Request

2018-07-19 Thread Gian Merlino
Hi Dongjin,

To subscribe, just send a mail to dev-subscr...@druid.apache.org.

On Wed, Jul 18, 2018 at 9:55 PM Dongjin Lee  wrote:

> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>


Re: synchronization question about datasketches aggregator

2018-07-19 Thread Gian Merlino
Hi Will,

Check out also this thread for related discussion:

https://lists.apache.org/thread.html/9899aa790a7eb561ab66f47b35c8f66ffe695432719251351339521a@%3Cdev.druid.apache.org%3E

On Thu, Jul 19, 2018 at 11:21 AM Will Lauer  wrote:

> A colleague recently pointed out to me that all the sketch operations that
> take place in SketchAggregator (in the datasketches module) use a
> SynchronizedUnion class that basically wraps a normal sketch Union and
> synchronizes all operations. From what I can tell with other aggregators in
> the Druid code base, there doesn't appear to be a need to synchronize. It
> looks like Aggregators are always processed from within a single thread. Is
> it reasonable to remove all the synchronization from the SketchAggregator
> and avoid the performance hit that it imposes at runtime?
>
> Will
>
> Will Lauer
> Senior Principal Architect
>
> Progress requires pain
>
> m: 508.561.6427
>
> o: 217.255.4262
>


Re: list polluted by gitbox messages

2018-07-19 Thread Gian Merlino
We're working with infra to redirect the notifications:
https://issues.apache.org/jira/browse/INFRA-16674

In the meantime, I have been using these filters to keep myself sane:
https://gist.github.com/gianm/0eb410915c02e3844e11235172894c62 (it's a gist
because the filters are partially based on content, and if I paste them in
this message, they'll miscategorize it…)

On Thu, Jul 19, 2018 at 9:56 AM Prashant Deva 
wrote:

> seems like every bit of activity on gitbox is being posted to the dev
> mailing list. its impossible to see any real messages since all i see are
> gitbox mails.
>
> Prashant
>


Re: Build failure on 0.13.SNAPSHOT

2018-07-25 Thread Gian Merlino
Hi Dongjin,

Thanks for doing this research. I don't see building-druid.png attached to
your mail -- maybe the ASF mailing list ate it? Please do send in a pull
request to update the documentation. Fwiw I think 8GB should be enough
memory with the right settings, since our Travis CI build environment only
has about 7.5GB. We use the sudo-enabled trusty environment described here:
https://docs.travis-ci.com/user/reference/overview/. Our Travis config is
here: https://github.com/apache/incubator-druid/blob/master/.travis.yml

Separately it would also be interesting to see what this memory is being
used for during the build. I wonder if it's unavoidable or if some of the
memory use is 'silly' and can be reduced.

On Tue, Jul 24, 2018 at 9:46 PM Dongjin Lee  wrote:

> After some experiments, I figured out the following:
>
> 1. Druid uses above 8gb of memory for testing. (building-druid.png)
> 2. With 8gb(physical)+4gb(swap) of memory, the test succeeds regardless of
> maven version (3.3.9, 3.5.2, 3.5.4) or MAVEN_OPTS. However, with
> 8gb(physical)+2gb(swap) of memory[^1], some tests failed. The list of
> failing tests differs between maven 3.3.9 and 3.5.2.
>
> In short, retaining sufficient memory solved the problem - *It seems like
> 12gb of memory is a recommended setting for building druid.* (I guess
> lots of you are working with the MacBook Pro with 16gb RAM, right? In that
> case, you must not have encountered this problem.)
>
> If you are okay, may I update the building documentation for the newbies
> like me?
>
> Thanks,
> Dongjin
>
> +1. While building Druid, I found another problem. But this issue should
> be discussed in another thread.
>
> [^1]: You know, the other processes also occupy the memory.
>
>
> On Tue, Jul 24, 2018 at 3:07 AM Jihoon Son  wrote:
>
>> I'm also using Maven 3.5.2 and not using any special configurations for
>> Maven, and I have never seen that error either.
>> Most of our Travis jobs have been working with only 512 MB of direct
>> memory. Only the 'strict compilation' Travis job requires 3 GB of memory.
>>
>> I think it's worthwhile to look into this more. Maybe we somehow use more
>> memory when we run all tests by 'mvn install'. Maybe this relates to the
>> frequent transient failures of 'processing module test', one of our Travis
>> jobs.
>>
>> Jihoon
>>
>> On Mon, Jul 23, 2018 at 9:32 AM Gian Merlino  wrote:
>>
>> > Interesting. Fwiw, I am using Maven 3.5.2 for building Druid and it has
>> > been working for me. I don't think I'm using any special Maven
>> > overrides (at least, I don't see anything interesting in my ~/.m2
>> > directory or in my environment variables). It might have to do with how
>> > much memory our machines have? I do most of my builds on a Mac with 16GB
>> > RAM. Maybe try checking .travis.yml in the druid repo. It sets -Xmx3000m
>> > for mvn install commands, which might be needed for more low memory
>> > environments.
>> >
>> > $ mvn --version
>> > Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d;
>> > 2017-10-18T00:58:13-07:00)
>> > Maven home: /usr/local/Cellar/maven/3.5.2/libexec
>> > Java version: 1.8.0_161, vendor: Oracle Corporation
>> > Java home:
>> > /Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home/jre
>> > Default locale: en_US, platform encoding: UTF-8
>> > OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"
>> >
>> > On Mon, Jul 23, 2018 at 6:40 AM Dongjin Lee  wrote:
>> >
>> > > Finally, it seems like I found the reason. It was a composition of
>> > > several problems:
>> > >
>> > > - Druid should not be built with maven 3.5.x. With 3.5.2, test suites
>> > > like `GroupByQueryRunnerFailureTest` fail. After I switched to 3.3.9,
>> > > which is built into the latest version of IntelliJ, those errors
>> > > disappeared. It seems like maven 3.5.x is not stable yet - it applied a
>> > > drastic change, and it is also why they skipped 3.4.x.
>> > > - It seems like Druid requires some MaxDirectMemorySize configuration
>> > > for some test suites. With a JVM parameter like
>> > > `-XX:MaxDirectMemorySize=4g` some test suites passed, but not all. I am
>> > > now trying the other options with enlarged swap space.
>> > >
>> > > Question: How much MaxDirectMemorySize configuration are you using?
>>

Re: Release process for Maven artifacts?

2018-07-25 Thread Gian Merlino
Ah, I forgot to mention the parent pom. Glad you spotted it.

On Tue, Jul 24, 2018 at 10:32 PM Joseph Glanville  wrote:

> Hi Gian,
>
> I was able to publish our own artifacts by creating a new oss-parent pom
> pointing to our Maven repository and updating the Druid root pom to point
> to it.
> It was non-obvious where the release plugin was configured until I spotted
> the parent reference.
>
> Joseph.
>
> On Tue, Jul 24, 2018 at 9:09 AM, Gian Merlino  wrote:
>
> > Hi Joseph,
> >
> > For official releases we just do mvn release:prepare release:perform. The
> > poms encode everything that has to happen, including sources jars,
> > javadocs jars, signing everything, and pushing it up to Maven Central. You
> > might be able to modify them to push to a different repo.
> >
> > On Mon, Jul 23, 2018 at 5:06 PM Joseph Glanville 
> > wrote:
> >
> > > Hi,
> > >
> > > Is the release process for publishing Maven artifacts documented
> > > somewhere? We have been building tar archives with `mvn package`
> > > successfully but we would like to publish our own Maven artifacts also,
> > > including `-sources` JARs so we can add them as dependencies to projects
> > > outside of the Druid source tree.
> > >
> > > Joseph.
> > >
> >
>


Re: Flushing Travis Caches

2018-07-25 Thread Gian Merlino
No objection, I did it at one point in the past too when it seemed to have
got corrupted (weird, inexplicable errors unpacking jars that got fixed
when the cache was cleared).

On Wed, Jul 25, 2018 at 10:10 AM Charles Allen
 wrote:

> The cache for one of my PRs got corrupt on travis and I had to clear it. I
> noticed the resulting cache is about 90% smaller! My question is as
> follows:
>
> Any objection to deleting (purging) the travis caches on occasion? I'm
> specifically interested in master
>
> Cheers,
> Charles Allen
>


Re: Towards 0.13 (Apache release)

2018-08-30 Thread Gian Merlino
That PR is merged now! If anyone here still has outstanding PRs that are
now in conflict with master, try running this before merging master, it
really helps git out.

  git config --local merge.renameLimit 5000

My experience was that even a patch with a few dozen changed files merged
pretty cleanly, after setting this config. I just had a few conflicts to
resolve in imports.

On Wed, Aug 29, 2018 at 4:09 PM Gian Merlino  wrote:

> I just raised https://github.com/apache/incubator-druid/pull/6266. I
> think for sanity's sake, I would really appreciate it if we got this one
> merged before merging any other PRs. (It will conflict with 100% of other
> PRs)
>
> On Wed, Aug 29, 2018 at 9:34 AM Gian Merlino  wrote:
>
>> Hi everyone,
>>
>> As we continue towards 0.13 I started looking into the "great renaming"
>> (of all packages from io.druid -> org.apache.druid) and am getting a PR
>> ready. I know Slim is working on
>> https://github.com/apache/incubator-druid/pull/6215 too (automated
>> license checking and some header fixups).
>>
>> Other than these Apache related items, we have 26 open issues/PRs in the
>> 0.13.0 milestone: https://github.com/apache/incubator-druid/milestone/25.
>> Is this everything we want to include? Is anything there we should bump to
>> the next release? Is anything _not_ there that needs to be added?
>>
>> Let's figure out when we can target a code freeze -- the start of the RC
>> train for our first Apache release!!
>>
>


Re: Druid 0.12.3-rc1 available

2018-09-05 Thread Gian Merlino
Thanks Jon!!

How does everyone feel about starting the vote on this next Monday?

On Tue, Sep 4, 2018 at 12:41 PM Jonathan Wei  wrote:

> We're happy to announce our next release candidate, Druid 0.12.3-rc1!
>
> Druid 0.12.3 is a non-ASF release. It contains stability improvements and
> bug fixes from 6 contributors. Major improvements include a more stable
> Kafka indexing service and several query bug fixes
>
> Everyone in the community is invited to help out with the upcoming release
> by downloading this candidate and evaluating it.
>
> Draft release notes are at:
> https://github.com/druid-io/druid/issues/6288
>
> Documentation for this release candidate is at:
> http://druid.io/docs/0.12.3-rc1/
>
> You can download the release candidate here:
> http://druid.io/downloads.html
>
> Please file GitHub issues if you find any problems:
> https://github.com/druid-io/druid/issues/new
>
> Thanks everyone who contributed!
>


Re: Towards 0.13 (Apache release)

2018-08-29 Thread Gian Merlino
I just raised https://github.com/apache/incubator-druid/pull/6266. I think
for sanity's sake, I would really appreciate it if we got this one merged
before merging any other PRs. (It will conflict with 100% of other PRs)

On Wed, Aug 29, 2018 at 9:34 AM Gian Merlino  wrote:

> Hi everyone,
>
> As we continue towards 0.13 I started looking into the "great renaming"
> (of all packages from io.druid -> org.apache.druid) and am getting a PR
> ready. I know Slim is working on
> https://github.com/apache/incubator-druid/pull/6215 too (automated
> license checking and some header fixups).
>
> Other than these Apache related items, we have 26 open issues/PRs in the
> 0.13.0 milestone: https://github.com/apache/incubator-druid/milestone/25.
> Is this everything we want to include? Is anything there we should bump to
> the next release? Is anything _not_ there that needs to be added?
>
> Let's figure out when we can target a code freeze -- the start of the RC
> train for our first Apache release!!
>


Druid August 2018 Bay Area meetup

2018-07-10 Thread Gian Merlino
Please join us for our next Druid meetup at Lyft, which you can RSVP to
here: https://www.meetup.com/druidio/events/252515792/

Speakers will be sharing real-world experience with Druid and discussing
the upcoming Druid roadmap.

Hope to see everyone there!


Re: Question about sketches aggregation in druid

2018-07-10 Thread Gian Merlino
Hi Eshcar,

To my knowledge, in the Druid Aggregator and BufferAggregator interfaces,
the main place where concurrency happens is that "aggregate" and "get" may
be called simultaneously during realtime ingestion. So if there would be a
benefit from improving concurrency it would probably end up in that area.
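
To make that access pattern concrete, here is a minimal Python sketch. It is
not Druid or sketches-core code: LockedUnion and run_demo are hypothetical
names, and an exact set stands in for a theta sketch. It only shows "aggregate"
and "get" being called from several threads with one lock guarding every
access, which is the locking style described for SynchronizedUnion in the
quoted message below.

```python
import threading


class LockedUnion:
    """Exact set union guarded by a single lock (stand-in for a sketch union)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = set()

    def aggregate(self, values):
        # Every update takes the lock.
        with self._lock:
            self._items.update(values)

    def get(self):
        # Every read takes the same lock, so readers and writers serialize.
        with self._lock:
            return len(self._items)


def run_demo(writers=4, per_writer=1000):
    union = LockedUnion()

    def write(start):
        for value in range(start, start + per_writer):
            union.aggregate([value])
            union.get()  # reads interleave with writes

    threads = [
        threading.Thread(target=write, args=(i * per_writer,))
        for i in range(writers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return union.get()
```

The lock keeps the result correct, at the cost of taking it on every call. A
concurrent implementation would aim to keep the same result while reducing
that per-access cost.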

On Tue, Jul 10, 2018 at 2:10 AM Eshcar Hillel 
wrote:

> Hi All,
> My name is Eshcar Hillel from Oath research. I'm currently working with
> Lee Rhodes on committing a new concurrent implementation of the theta
> sketch to the sketches-core library. I was wondering whether this
> implementation can help boost the union operation that is applied to
> multiple sketches at query time in druid. From what I see in the code the
> sketch aggregator uses the SynchronizedUnion implementation, which
> basically uses a lock at every single access (update/read) of the union
> operation. We believe a thread-safe implementation of the union operation
> can help decrease the inherent overhead of the lock.
> I will be happy to join the meeting today and briefly discuss this option.
> Thanks, Eshcar
>
>
>


Re: TeamCity Inspections CI broke

2018-07-10 Thread Gian Merlino
Any idea what has to be done to fix it? I am not too familiar with how the
TeamCity integration works. Is it something we can fix ourselves or
something Infra has to do?

On Tue, Jul 10, 2018 at 7:29 AM Roman Leventov  wrote:

> Since the repo was moved to apache org, the integration failed, the error
> message is:
>
> Commit Status Publisher error. GitHub status: 'Success'. Publisher:
> githubStatusPublisher(https://api.github.com). Failed to complete request
> to GitHub. Please check if the error is not returned by a proxy or caused
> by the lack of permissions. Status: HTTP/1.1 404 Not Found
>
> You could find it here:
>
> https://teamcity.jetbrains.com/project.html?projectId=OpenSourceProjects_Druid=projectOverview
>


Re: Druid repo migration plan

2018-07-06 Thread Gian Merlino
I asked in our original Infra ticket to adjust the mailing lists:
https://issues.apache.org/jira/browse/INFRA-16674?focusedCommentId=16535589=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16535589

Unfortunately setting up filters proved challenging in Gmail (gitbox sets a
custom header that could be used in theory, but Gmail doesn't support
filtering on arbitrary headers). I'm doing something like this now, in case
it proves useful for anyone else:
https://gist.github.com/gianm/0eb410915c02e3844e11235172894c62 (it's a gist
because the filters are partially based on content, and if I paste them in
this message, they'll catch it as a false positive gitbox mail…)

On Fri, Jul 6, 2018 at 1:31 PM Gian Merlino  wrote:

> IMO, GitHub has good tools already for subscribing to repos,
> notifications, and issues. I would expect Druid contributors to use those
> tools and _not_ to join git...@druid.apache.org. I'd imagine the list
> would only exist for archival purposes (we want a copy of every discussion
> to be stored somehow on ASF infra).
>
> On Fri, Jul 6, 2018 at 1:27 PM Julian Hyde  wrote:
>
>> I see there are notifications not just for commits but for every comment
>> on every issue. That’s going to be overwhelming. I can’t imagine anyone
>> wanting to subscribe to the gitbox list at the current volume.
>>
>> How about only sending comments to people who have watched a particular
>> case?
>>
>> Julian
>>
>>
>> > On Jul 6, 2018, at 1:08 PM, Julian Hyde  wrote:
>> >
>> > I like the idea of a separate list for commits. Keeps the noise down on
>> the dev list.
>> >
>> > Other projects (e.g. calcite) would name that list “commits”, i.e.
>> comm...@druid.apache.org <mailto:comm...@druid.apache.org>. If you were
>> to add other repos (e.g. a subversion repo for web site) you could send its
>> commits there too.
>> >
>> >> On Jul 6, 2018, at 12:24 PM, Samarth Jain > <mailto:samarth.j...@gmail.com>> wrote:
>> >>
>> >> +1 to sending to git...@druid.apache.org > git...@druid.apache.org>
>> >>
>> >> On Fri, Jul 6, 2018 at 12:17 PM, Gian Merlino > <mailto:g...@apache.org>> wrote:
>> >>
>> >>> The repo move has happened: https://github.com/apache/incubator-druid
>> <https://github.com/apache/incubator-druid>
>> >>>
>> >>> My understanding is that now we can and should start prepping license,
>> >>> notice, etc for a release. And also making sure that our integrations
>> still
>> >>> work right (Travis / TeamCity).
>> >>>
>> >>> Also: does anyone else find the gitbox mails sent to
>> dev@druid.apache.org <mailto:dev@druid.apache.org>
>> >>> annoying? I'm planning to set up a mail filter to put them into
>> another
>> >>> folder (they're redundant to notifications I already get from
>> GitHub). If
>> >>> people generally feel the same way we could ask Infra to move them to
>> a
>> >>> separate mailing list, maybe git...@druid.apache.org > git...@druid.apache.org>.
>> >>>
>> >>> On Tue, Jul 3, 2018 at 12:57 PM Gian Merlino > <mailto:g...@apache.org>> wrote:
>> >>>
>> >>>> Here is the ticket, btw: https://issues.apache.org/ <
>> https://issues.apache.org/>
>> >>> jira/browse/INFRA-16674.
>> >>>> Repo move should be happening real soon now!
>> >>>>
>> >>>> On Mon, Jul 2, 2018 at 11:55 PM Gian Merlino > <mailto:g...@apache.org>> wrote:
>> >>>>
>> >>>>> Our infra ticket is progressing along and it looks like we're just
>> about
>> >>>>> ready to pull the trigger on moving the repo. So, committers, please
>> >>> make
>> >>>>> sure your ASF gitbox stuff is working:
>> https://gitbox.apache.org/setup/ <https://gitbox.apache.org/setup/>
>> >>>>>
>> >>>>> On Fri, Jun 22, 2018 at 1:22 PM Gian Merlino > <mailto:g...@apache.org>> wrote:
>> >>>>>
>> >>>>>> Thanks for the tips, Max!! I think we are, hopefully, doing okay on
>> >>> some
>> >>>>>> of these. My thoughts inline.
>> >>>>>>
>> >>>>>>> Since you need elevated rights on both
>> >>>>>> orgs to move the repo (say airbnb and apache) and that both parties
>> >>>>>

Re: Druid repo migration plan

2018-07-06 Thread Gian Merlino
IMO, GitHub has good tools already for subscribing to repos, notifications,
and issues. I would expect Druid contributors to use those tools and _not_
to join git...@druid.apache.org. I'd imagine the list would only exist for
archival purposes (we want a copy of every discussion to be stored somehow
on ASF infra).

On Fri, Jul 6, 2018 at 1:27 PM Julian Hyde  wrote:

> I see there are notifications not just for commits but for every comment
> on every issue. That’s going to be overwhelming. I can’t imagine anyone
> wanting to subscribe to the gitbox list at the current volume.
>
> How about only sending comments to people who have watched a particular
> case?
>
> Julian
>
>
> > On Jul 6, 2018, at 1:08 PM, Julian Hyde  wrote:
> >
> > I like the idea of a separate list for commits. Keeps the noise down on
> the dev list.
> >
> > Other projects (e.g. calcite) would name that list “commits”, i.e.
> comm...@druid.apache.org <mailto:comm...@druid.apache.org>. If you were
> to add other repos (e.g. a subversion repo for web site) you could send its
> commits there too.
> >
> >> On Jul 6, 2018, at 12:24 PM, Samarth Jain  <mailto:samarth.j...@gmail.com>> wrote:
> >>
> >> +1 to sending to git...@druid.apache.org  git...@druid.apache.org>
> >>
> >> On Fri, Jul 6, 2018 at 12:17 PM, Gian Merlino  g...@apache.org>> wrote:
> >>
> >>> The repo move has happened: https://github.com/apache/incubator-druid
> <https://github.com/apache/incubator-druid>
> >>>
> >>> My understanding is that now we can and should start prepping license,
> >>> notice, etc for a release. And also making sure that our integrations
> still
> >>> work right (Travis / TeamCity).
> >>>
> >>> Also: does anyone else find the gitbox mails sent to
> dev@druid.apache.org <mailto:dev@druid.apache.org>
> >>> annoying? I'm planning to set up a mail filter to put them into another
> >>> folder (they're redundant to notifications I already get from GitHub).
> If
> >>> people generally feel the same way we could ask Infra to move them to a
> >>> separate mailing list, maybe git...@druid.apache.org  git...@druid.apache.org>.
> >>>
> >>> On Tue, Jul 3, 2018 at 12:57 PM Gian Merlino  g...@apache.org>> wrote:
> >>>
> >>>> Here is the ticket, btw: https://issues.apache.org/ <
> https://issues.apache.org/>
> >>> jira/browse/INFRA-16674.
> >>>> Repo move should be happening real soon now!
> >>>>
> >>>> On Mon, Jul 2, 2018 at 11:55 PM Gian Merlino  <mailto:g...@apache.org>> wrote:
> >>>>
> >>>>> Our infra ticket is progressing along and it looks like we're just
> about
> >>>>> ready to pull the trigger on moving the repo. So, committers, please
> >>> make
> >>>>> sure your ASF gitbox stuff is working:
> https://gitbox.apache.org/setup/ <https://gitbox.apache.org/setup/>
> >>>>>
> >>>>> On Fri, Jun 22, 2018 at 1:22 PM Gian Merlino  <mailto:g...@apache.org>> wrote:
> >>>>>
> >>>>>> Thanks for the tips, Max!! I think we are, hopefully, doing okay on
> >>> some
> >>>>>> of these. My thoughts inline.
> >>>>>>
> >>>>>>> Since you need elevated rights on both
> >>>>>> orgs to move the repo (say airbnb and apache) and that both parties
> >>>>>> aren't
> >>>>>> ok with that, it's typical to use a middleman org like
> `apacheinfra`.
> >>>>>>
> >>>>>> Luckily, our org is limited to just Druid stuff (
> >>>>>> https://github.com/druid-io <https://github.com/druid-io>) so we
> should be OK to add Apache Infra
> >>>>>> people with elevated rights.
> >>>>>>
> >>>>>>> * make merge hook checks optional, so that if coverage, travis, or
> >>> code
> >>>>>> quality checks do not prevent merging, since it's likely those check
> >>>>>> won't
> >>>>>> trigger and as a non-admin you won't be able to force-merge
> >>>>>>
> >>>>>> We have a couple (Travis and TeamCity) and they're already optional.
> >>>>>>
> >>>>>>> * consider unprotecting protected branches so that you can push to
> >>>>>> master
> >>>>>>

Re: Regarding becoming a contributor

2018-07-12 Thread Gian Merlino
Hi Himanshu,

Awesome that you are interested in helping out! We have a community page
here that describes how you can get started: http://druid.io/community/

The basics are:

1) Subscribe to the dev list.
2) If you are using Druid today and have an itch, scratching that itch is
the best way to get started -- ask around to see if there's a way to
implement a feature you want or fix a bug that's bugging you.
3) If you want some ideas, check out open issues, especially the ones
labeled "easy" (they tend to make good starter issues), like
https://github.com/apache/incubator-druid/issues/5869 or
https://github.com/apache/incubator-druid/issues/5644.

Happy Druiding!

On Thu, Jul 12, 2018 at 9:56 AM Himanshu Pandey 
wrote:

> Hey,
>
> I recently came across Druid and would like to contribute to it.
>
> Where/how can I start? I was checking the list of open issues, but it looks
> like most of them are already being worked on or fixed.
>
> Thanks!
>
> *Thanks & Regards,*
> *Himanshu Pandey*
> *Cell: +1 (408) 644 - 8765*
>


Re: Druid 0.12.2-rc1 vote

2018-07-11 Thread Gian Merlino
Well, it's never good if a WTH?! message actually gets logged. They are
usually meant to be things that should "never" happen. I am ok with holding
off 0.12.2-rc1 until this fix is in.

On Wed, Jul 11, 2018 at 1:04 PM Jihoon Son  wrote:

> Thanks everyone for voting.
>
> Unfortunately, I found another bug in Kafka indexing service (
> https://github.com/apache/incubator-druid/issues/5992). I think it's worth
> including the fix in 0.12.2.
> I'm currently working on that issue and can probably finish at least by
> this week.
>
> Can we add it to 0.12.2 and vote again once a patch to fix is merged?
>
> Jihoon
>
> On Wed, Jul 11, 2018 at 10:02 AM Jonathan Wei  wrote:
>
> > +1
> >
> > On Wed, Jul 11, 2018 at 9:44 AM, Gian Merlino  wrote:
> >
> > > +1 from me too!
> > >
> > > On Wed, Jul 11, 2018 at 7:28 AM Charles Allen 
> > wrote:
> > >
> > > > That is very helpful, thank you!
> > > >
> > > > +1 for continuing with 0.12.2-RC1
> > > >
> > > > On Tue, Jul 10, 2018 at 6:51 PM Clint Wylie 
> > > wrote:
> > > >
> > > > > Heya, sorry for the delay (and missing the sync, i'll try to get
> > better
> > > > > about showing up). I've fixed a handful of coordinator bugs post
> > 0.12.0
> > > > > (and
> > > > > not backported to 0.12.1), some of these issues go far back, some
> > back
> > > to
> > > > > when segment assignment priority for different tiers of historicals
> > was
> > > > > introduced, some are just some oddities on the behavior of the
> > balancer
> > > > > that I am unsure when were introduced. This is the complete list of
> > > fixes
> > > > > that are currently in 0.12.2 afaik, with a small description (see
> PRs
> > > and
> > > > > associated issues for more details)
> > > > >
> > > > > https://github.com/apache/incubator-druid/pull/5528 fixed an issue
> > > that
> > > > > movement did not drop the segment from the server the segment was
> > being
> > > > > > moved from (this one goes way back, to batch segment
> > announcements)
> > > > >
> > > > > https://github.com/apache/incubator-druid/pull/5529 changed
> behavior
> > > of
> > > > > drop to use the balancer to choose where to drop segments from,
> based
> > > on
> > > > > behavior observed caused by the issue of 5528
> > > > >
> > > > > https://github.com/apache/incubator-druid/pull/5532 fixes an issue
> > > where
> > > > > primary assignment during load rule processing would assign an
> > > > unavailable
> > > > > segment to every server with capacity until at least 1 historical
> had
> > > the
> > > > > segment (and drop it from all the others if they all loaded at the
> > same
> > > > > time), choking load queues from doing useful things
> > > > >
> > > > > https://github.com/apache/incubator-druid/pull/ fixed a way
> for
> > > http
> > > > > based coordinator to get stuck loading or dropping segments and a
> > > > companion
> > > > > PR that fixed a lambda that wasn't friendly to older jvm versions
> > > > > https://github.com/apache/incubator-druid/pull/5591
> > > > >
> > > > > https://github.com/apache/incubator-druid/pull/5888 makes
> balancing
> > > > honor
> > > > > a
> > > > > load rule max load queue depth setting to help prevent movement
> from
> > > > > starving loading
> > > > >
> > > > > https://github.com/apache/incubator-druid/pull/5928 doesn't really
> > fix
> > > > > anything, just does an early return to avoid doing pointless work
> > > > >
> > > > > Additionally, there are a couple of pairs of PRs that are not
> > currently
> > > > in
> > > > > 0.12.2: https://github.com/druid-io/druid/pull/5927 and
> > > > > https://github.com/apache/incubator-druid/pull/5929 and their
> > > respective
> > > > > fixes which have yet to be merged, but have been performing well on
> > our
> > > > > test cluster, https://github.com/apache/incubator-druid/pull/5987
> > and
> > > > > https://github.com/apache/incubator-druid/pull/5988. One of them
> > makes
> > > > > balanc

Re: [druid-dev] Apache migration logistics

2018-03-12 Thread Gian Merlino
Committers: please,

1) If you don't have an apache id already, fill out an ICLA:
https://www.apache.org/dev/new-committers-guide.html#
guide-for-new-committers and then post here and hopefully someone can
figure out how to get you an id?

2) When you have an id, post it here if it's not in
http://incubator.apache.org/projects/druid.html so someone can figure out
how to add you to that, and then also try to sign up to
private-subscr...@druid.apache.org (+ dev-subscr...@druid.apache.org which
you should be on already). If you can't, then also post here, and hopefully
someone can figure _that_ out.

Gian

On Fri, Mar 9, 2018 at 11:28 AM, Xavier Léauté  wrote:

> This thread is already going to both lists, and it looks like responses
> automatically go to both. Would be good to check what happens if we
> subscribe dev@ to the google group. If responding from the apache list
> doesn't automatically add the google group as well, it will be hard to keep
> the group useful.
>
> Agree with Julian a cutoff is necessary anyway, since the google group
> inherently becomes less useful over time, as some information only ends up
> in the apache list.
>
> On Fri, Mar 9, 2018 at 11:14 AM Nishant Bangarwa 
> wrote:
>
>> We can register dev@druid.apache.org and us...@druid.apache.org as a
>> user in druid user groups so that going forward any mails that are sent to
>> druid google groups are also received on the apache lists and is on the
>> record. This would be to bridge the gap during the migration only.
>>
>> @Julian, I can go ahead and try setting this up, if this seems reasonable?
>>
>> On Sat, 10 Mar 2018 at 00:09 Julian Hyde  wrote:
>>
>>> I don’t know. I don’t think it’s easy.
>>>
>>>
>>> On Mar 9, 2018, at 7:31 AM, Roman Leventov >> com> wrote:
>>>
>>> Could archives of druid-dev and druid-users mailing lists be transferred
>>> to the new lists?
>>>
>>> On Thu, Mar 8, 2018 at 8:48 AM, Julian Hyde  wrote:
>>>
 I’m working on it. It turns out that I don’t have sufficient karma to
 create a git repo, so I’ve put in a request on the incubator list.


 On Mar 6, 2018, at 10:12 AM, Xavier Léauté  wrote:

 Julian, it looks like you or one of the mentors has to request the
 source code repos. Could you request a gitbox enabled repo?

 Based on https://incubator.apache.org/guides/transitioning_asf.html,
 for the initial migration, we need to involve infra to import the initial
 git history and grant them admin rights to the github repo.

 Charles, it also sounds like we won't be able to do any code migration
 until legal signs off on the software grant, could you drive that?

 On Mon, Mar 5, 2018 at 12:52 PM Julian Hyde 
 wrote:

> The dev, users and private mailing lists now exist. You can see the
> archives:
>
> * https://lists.apache.org/list.html?dev@druid.apache.org
> * https://lists.apache.org/list.html?us...@druid.apache.org
> * https://lists.apache.org/list.html?priv...@druid.apache.org
>
> To see the last of these, you need to log in.
>
> There will also be archives at https://mail-archives.
> apache.org/mod_mbox/druid-dev/ etc. (you might need to wait a few
> minutes for the archiver to catch up).
>
> If you are an initial committer or mentor, you are a member of the
> Druid PPMC and you must be on both dev and private lists now. Send a
> message to dev-subscr...@druid.apache.org and private-subscribe@
> druid.apache.org.
>
> Everyone else is welcome to join dev and/or users.
>
> Julian
>
> On Mar 5, 2018, at 11:58 AM, Nishant Bangarwa <
> nishant.mon...@gmail.com> wrote:
>
> Apache Incubator Superset is another example which uses github issues
> - https://github.com/apache/incubator-superset/issues
> For Superset it works as all the github issue interactions are
> captured in ASF owned mailing list via Gitbox Integration.
> See - https://lists.apache.org/list.html?d...@superset.apache.org
>
> For Druid, If everyone agrees we can also choose to capture
> interactions on github issues at an Apache Owned mailing list e.g
> iss...@druid.apache.org and continue to use github issues.
>
> @Jihoon, Thanks for the Airflow migration link, super helpful.
>
>
>
> On Tue, 6 Mar 2018 at 00:44 Jihoon Son  wrote:
>
>> Gian,
>>
>> there was a discussion for using a third-party issue tracker (
>> https://issues.apache.org/jira/browse/LEGAL-249). I think the point
>> is
>>
>> > Okay, it looks like the requirement is just to capture the intent
>> to contribute in ASF-owned infrastructure. That means that the automated
>> process that adds PR information to a JIRA issue or sends it to a 

Re: [druid-dev] Apache migration logistics

2018-03-13 Thread Gian Merlino
I think rather than subscribing the apache dev list to the google group,
it'd be better to do a clean break and disable posting on the google group.
The autoreply feature seems to need a business account, which we don't
have, but we could turn on "Moderate all messages to the group" and combine
that with a "Rejected author notification" message. I think that would be
pretty close.

Gian

On Fri, Mar 9, 2018 at 11:28 AM, Xavier Léauté  wrote:

> This thread is already going to both lists, and it looks like responses
> automatically go to both. Would be good to check what happens if we
> subscribe dev@ to the google group. If responding from the apache list
> doesn't automatically add the google group as well, it will be hard to keep
> the group useful.
>
> Agree with Julian a cutoff is necessary anyway, since the google group
> inherently becomes less useful over time, as some information only ends up
> in the apache list.
>
> On Fri, Mar 9, 2018 at 11:14 AM Nishant Bangarwa 
> wrote:
>
>> We can register dev@druid.apache.org and us...@druid.apache.org as a
>> user in druid user groups so that going forward any mails that are sent to
>> druid google groups are also received on the apache lists and is on the
>> record. This would be to bridge the gap during the migration only.
>>
>> @Julian, I can go ahead and try setting this up, if this seems reasonable?
>>
>> On Sat, 10 Mar 2018 at 00:09 Julian Hyde  wrote:
>>
>>> I don’t know. I don’t think it’s easy.
>>>
>>>
>>> On Mar 9, 2018, at 7:31 AM, Roman Leventov >> com> wrote:
>>>
>>> Could archives of druid-dev and druid-users mailing lists be transferred
>>> to the new lists?
>>>
>>> On Thu, Mar 8, 2018 at 8:48 AM, Julian Hyde  wrote:
>>>
 I’m working on it. It turns out that I don’t have sufficient karma to
 create a git repo, so I’ve put in a request on the incubator list.


 On Mar 6, 2018, at 10:12 AM, Xavier Léauté  wrote:

 Julian, it looks like you or one of the mentors has to request the
 source code repos. Could you request a gitbox enabled repo?

 Based on https://incubator.apache.org/guides/transitioning_asf.html,
 for the initial migration, we need to involve infra to import the initial
 git history and grant them admin rights to the github repo.

 Charles, it also sounds like we won't be able to do any code migration
 until legal signs off on the software grant, could you drive that?

 On Mon, Mar 5, 2018 at 12:52 PM Julian Hyde 
 wrote:

> The dev, users and private mailing lists now exist. You can see the
> archives:
>
> * https://lists.apache.org/list.html?dev@druid.apache.org
> * https://lists.apache.org/list.html?us...@druid.apache.org
> * https://lists.apache.org/list.html?priv...@druid.apache.org
>
> To see the last of these, you need to log in.
>
> There will also be archives at https://mail-archives.
> apache.org/mod_mbox/druid-dev/ etc. (you might need to wait a few
> minutes for the archiver to catch up).
>
> If you are an initial committer or mentor, you are a member of the
> Druid PPMC and you must be on both dev and private lists now. Send a
> message to dev-subscr...@druid.apache.org and private-subscribe@
> druid.apache.org.
>
> Everyone else is welcome to join dev and/or users.
>
> Julian
>
> On Mar 5, 2018, at 11:58 AM, Nishant Bangarwa <
> nishant.mon...@gmail.com> wrote:
>
> Apache Incubator Superset is another example which uses github issues
> - https://github.com/apache/incubator-superset/issues
> For Superset it works as all the github issue interactions are
> captured in ASF owned mailing list via Gitbox Integration.
> See - https://lists.apache.org/list.html?d...@superset.apache.org
>
> For Druid, If everyone agrees we can also choose to capture
> interactions on github issues at an Apache Owned mailing list e.g
> iss...@druid.apache.org and continue to use github issues.
>
> @Jihoon, Thanks for the Airflow migration link, super helpful.
>
>
>
> On Tue, 6 Mar 2018 at 00:44 Jihoon Son  wrote:
>
>> Gian,
>>
>> there was a discussion for using a third-party issue tracker (
>> https://issues.apache.org/jira/browse/LEGAL-249). I think the point
>> is
>>
>> > Okay, it looks like the requirement is just to capture the intent
>> to contribute in ASF-owned infrastructure. That means that the automated
>> process that adds PR information to a JIRA issue or sends it to a mailing
>> list is fine.
>>
>> In short, it looks like it is allowed (Apache Fluo is using Github issue
>> tracker (https://github.com/apache/fluo/issues)), but it should be
>> captured by another issue 

Re: Podling report - April 2018

2018-04-06 Thread Gian Merlino
https://community.apache.org/newcommitter.html says that new accounts are
created by PMC chairs. In our case that would probably be the Incubator
PMC? So I'd try emailing gene...@incubator.apache.org and seeing if someone
there can help.

On Thu, Apr 5, 2018 at 12:16 AM, Roman Leventov <leventov...@gmail.com>
wrote:

> No I don't have an ID, just signed for making a PR to some project.
>
> On Thu, 5 Apr 2018, 02:49 Parag Jain, <paragjai...@gmail.com> wrote:
>
> > I've signed the ICLA today. Will update the page once I get the Apache
> id.
> >
> > On Wed, Apr 4, 2018, 6:16 PM Xavier Léauté <xav...@confluent.io> wrote:
> >
> > > Roman, in that case you should already have an Apache id, can you add
> it
> > to
> > > the page?
> > > On Wed, Apr 4, 2018 at 16:00 Roman Leventov <leventov...@gmail.com>
> > wrote:
> > >
> > > > I've signed an ICLA for a different project before, doesn't it count?
> > > >
> > > > On Thu, 5 Apr 2018, 01:39 Gian Merlino, <gianmerl...@gmail.com>
> wrote:
> > > >
> > > > > Ah okay cool, thanks
> > > > >
> > > > > On Wed, Apr 4, 2018 at 11:27 AM, Xavier Léauté <
> xav...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > I was just looking at http://people.apache.org/
> > > > > >
> > > > > > On Wed, Apr 4, 2018 at 7:59 AM Gian Merlino <
> gianmerl...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > I saved the report to the incubator wiki at
> > > https://wiki.apache.org/
> > > > > > > incubator/April2018 <https://wiki.apache.org/
> incubator/April2018
> > >.
> > > > > > >
> > > > > > > What's the directory you're looking at for ICLAs?
> > > > > > >
> > > > > > > On Tue, Apr 3, 2018 at 4:43 PM, Xavier Léauté <
> > xav...@confluent.io
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > I think I've seen ICLAs go through for most of the
> committers.
> > > > > > > > The only ones I can't find in the directory are Parag and
> > Roman.
> > > > > > > >
> > > > > > > > Everyone that hasn't updated the incubator page at
> > > > > > > > http://incubator.apache.org/projects/druid.html with their
> > > apache
> > > > > id,
> > > > > > > > would
> > > > > > > > you mind doing so?
> > > > > > > >
> > > > > > > > On Tue, Apr 3, 2018 at 4:31 PM Gian Merlino <
> > > gianmerl...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Druids,
> > > > > > > > >
> > > > > > > > > Here is a draft of our podling report for April 2018. Let
> me
> > > know
> > > > > > what
> > > > > > > > you
> > > > > > > > > think. It's due tomorrow EOD so I will post it tomorrow.
> > > > > > > > >
> > > > > > > > > 
> > > > > > > > >
> > > > > > > > > Three most important issues to address in the move towards
> > > > > > graduation:
> > > > > > > > >
> > > > > > > > >  1. Complete SGA for current sources and ICLAs for current
> > > > > > committers.
> > > > > > > > >  2. Move the source code and website to Apache
> > infrastructure.
> > > > > > > > >  3. Plan and execute our first Apache release.
> > > > > > > > >
> > > > > > > > > Any issues that the Incubator PMC (IPMC) or ASF Board
> > wish/need
> > > > to
> > > > > be
> > > > > > > > > aware of?
> > > > > > > > >
> > > > > > > > > - None.
> > > > > > > > >
> > > > > > > > > How has the community developed since the last report?
> > > > > > > > >
> > > > > > > > > - We have moved development discussions to our Apache dev
> > > mailing
> &

Re: [druid-dev] consider doing a 0.12.1 release

2018-04-09 Thread Gian Merlino
I think this conversation is worth having. I have cross posted this to
dev@druid.apache.org and will reply there. Since we're trying to migrate
the dev list, please cross post any dev messages there, or even only post
to that list.

Gian

On Sun, Apr 8, 2018 at 5:45 PM, Prashant Deva 
wrote:

> Current 0.12.0 release has some major issues:
>
>
>1. Coordinator loses leadership
>https://github.com/druid-io/druid/issues/5561
>
>2. Newly introduced Quantiles sketch is broken
>https://github.com/druid-io/druid/issues/5575
>
>3. Coordinator+overlord web console broken
>https://github.com/druid-io/druid/issues/5559
>
>
> 1. is especially very important. Without a coordinator, druid stops
> functioning.
> With bug 5561, it is impossible to use druid for long periods since
> coordinator eventually does lose leadership and the whole process needs to
> be restarted for it to come back.
>
> *Why not wait till 0.13.0?*
>
> A lot of companies like to update one version at a time and may not want
> to jump directly to 0.13.0.
> Those companies will hit a bad surprise due to bug 5561 essentially
> rendering the cluster useless in production.
> Also quantiles being the new feature and broken does not look good either.
>
> 0.12.0 is a 'release', not an RC, thus marking it good for production, but
> the bugs listed above prevent it from being used as such.
> I highly recommend 0.12.1 release, thus marking the right version to
> upgrade to from 0.11.0
>
> --
> You received this message because you are subscribed to the Google Groups
> "Druid Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to druid-development+unsubscr...@googlegroups.com.
> To post to this group, send email to druid-developm...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/ms
> gid/druid-development/165b491e-3ec2-4744-a228-d1270c9d283a%
> 40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>


Re: [druid-dev] Load rule doesn't honor intervals properly

2018-04-09 Thread Gian Merlino
Looks like it is under review right now. Thanks for the patch.

Gian

On Sun, Apr 8, 2018 at 8:04 PM, 'Pala Muthiah' via Druid Development <
druid-developm...@googlegroups.com> wrote:

> Hi Gian,
>
> Thanks for following up. I have submitted a patch: https://github.com/
> druid-io/druid/pull/5595.
>
> Whoever is the right owner please take a look - let me know if i should @
> a specific person and i can do that.
>
>
> Thanks,
> pala
>
>
>
> On Fri, Mar 30, 2018 at 2:12 PM, Gian Merlino <g...@imply.io> wrote:
>
>> Hi Pala,
>>
>> That sounds like a bug to me - a patch would be welcome!
>>
>> Btw, since we are trying to migrate the dev mailing list to Apache,
>> please cross post this sort of thing with d...@druid.incubator.apache.org,
>> or even only post to that list.
>>
>> Gian
>>
>> On Thu, Mar 29, 2018 at 5:43 PM, 'Pala Muthiah' via Druid Development <
>> druid-developm...@googlegroups.com> wrote:
>>
>>> Hello folks,
>>>
>>> Anybody have insight on the below? Curious to know if there would be
>>> unforeseen side effects if we count even partial overlap as valid.
>>>
>>> On Mon, Mar 19, 2018 at 10:54 AM, pala.muthiah via Druid Development <
>>> druid-developm...@googlegroups.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> In our deployment, we enabled background segment merging and found that
>>>> some of the data within the load period was actually getting dropped.
>>>>
>>>> My suspicion was that when a merged segment only partially overlaps
>>>> with a period (e.g: Rule says keep data from Jan 1st onwards, and i have a
>>>> segment that spans Dec 25th - Jan 2nd), for correctness that segment should
>>>> be kept but current implementation seems to drop it.
>>>>
>>>> I checked the code and found indeed Rules.eligibleForLoad() only keeps
>>>> segments that overlap fully.
>>>>
>>>> Is this a bug, or is there other reason behind this? In our case, we do
>>>> have data sources that are highly aggregated and therefore a single segment
>>>> could span a month for example.
>>>>
>>>> I can submit a patch but wanted to get proper context.
>>>>
>>>>
>>>> Thanks,
>>>> pala
>>>>
>>>> --
>>>> You received this message because you are subscribed to a topic in the
>>>> Google Groups "Druid Development" group.
>>>> To unsubscribe from this topic, visit https://groups.google.com/d/to
>>>> pic/druid-development/QYMhjGup2RI/unsubscribe.
>>>> To unsubscribe from this group and all its topics, send an email to
>>>> druid-development+unsubscr...@googlegroups.com.
>>>> To post to this group, send email to druid-developm...@googlegroups.com
>>>> .
>>>> To view this discussion on the web visit https://groups.google.com/d/ms
>>>> gid/druid-development/d8e3df1d-699e-4747-a681-ebba91327388%4
>>>> 0googlegroups.com
>>>> <https://groups.google.com/d/msgid/druid-development/d8e3df1d-699e-4747-a681-ebba91327388%40googlegroups.com?utm_medium=email_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Druid Development" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to druid-development+unsubscr...@googlegroups.com.
>>> To post to this group, send email to druid-developm...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/ms
>>> gid/druid-development/CALxr%3D2Wo9KKTYmFjbusc%2BsU0dt7K4k%3D
>>> P4JQX1T0sFe%3DLbBHXAA%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/druid-development/CALxr%3D2Wo9KKTYmFjbusc%2BsU0dt7K4k%3DP4JQX1T0sFe%3DLbBHXAA%40mail.gmail.com?utm_medium=email_source=footer>
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
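
The distinction discussed in this thread, partial overlap versus full containment, can be illustrated with a minimal sketch. This is not Druid's Rules.eligibleForLoad() code; the function names are hypothetical, intervals are treated as half-open, and the years are assumed only to make the Dec 25th - Jan 2nd example concrete.

```python
from datetime import datetime


def overlaps(seg_start, seg_end, rule_start, rule_end):
    """True if [seg_start, seg_end) and [rule_start, rule_end) share any instant."""
    return seg_start < rule_end and rule_start < seg_end


def contains(rule_start, rule_end, seg_start, seg_end):
    """True only if the segment lies entirely inside the rule interval."""
    return rule_start <= seg_start and seg_end <= rule_end


# Rule: keep data from Jan 1st onwards (year assumed for illustration).
rule_start = datetime(2018, 1, 1)
rule_end = datetime.max

# Merged segment spanning Dec 25th through the end of Jan 2nd.
seg_start = datetime(2017, 12, 25)
seg_end = datetime(2018, 1, 3)

partially_overlaps = overlaps(seg_start, seg_end, rule_start, rule_end)
fully_contained = contains(rule_start, rule_end, seg_start, seg_end)
```

A containment-only check rejects this segment even though it holds two days of data inside the load period; an overlap check keeps it.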

Re: Podling report - April 2018

2018-04-04 Thread Gian Merlino
I saved the report to the incubator wiki at
https://wiki.apache.org/incubator/April2018.

What's the directory you're looking at for ICLAs?

On Tue, Apr 3, 2018 at 4:43 PM, Xavier Léauté <xav...@confluent.io> wrote:

> I think I've seen ICLAs go through for most of the committers.
> The only ones I can't find in the directory are Parag and Roman.
>
> Everyone that hasn't updated the incubator page at
> http://incubator.apache.org/projects/druid.html with their apache id, would
> you mind doing so?
>
> On Tue, Apr 3, 2018 at 4:31 PM Gian Merlino <gianmerl...@gmail.com> wrote:
>
> > Hi Druids,
> >
> > Here is a draft of our podling report for April 2018. Let me know what you
> > think. It's due tomorrow EOD so I will post it tomorrow.
> >
> > 
> >
> > Three most important issues to address in the move towards graduation:
> >
> >  1. Complete SGA for current sources and ICLAs for current committers.
> >  2. Move the source code and website to Apache infrastructure.
> >  3. Plan and execute our first Apache release.
> >
> > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> > aware of?
> >
> > - None.
> >
> > How has the community developed since the last report?
> >
> > - We have moved development discussions to our Apache dev mailing list.
> > - A healthy, constant flow of bug fixes, quality improvements and new
> >   features is still ongoing on https://github.com/druid-io/druid.
> >
> > How has the project developed since the last report?
> >
> > - Since the last report there have been 36 commits from 18 individuals.
> > - We have released Druid 0.12.0 (outside the Incubator). We are optimistic
> >   that our next release will be done as an Apache release.
> >
> > How would you assess the podling's maturity?
> > Please feel free to add your own commentary.
> >
> >   [X] Initial setup
> >   [ ] Working towards first release
> >   [ ] Community building
> >   [ ] Nearing graduation
> >   [ ] Other:
> >
> > Date of last release:
> >
> > - Druid 0.12.0 on 2018-03-06 (non-Apache release)
> > - No official release yet since beginning Apache Incubation
> >
> > When were the last committers or PPMC members elected?
> >
> > - Project is still functioning with the initial set of committers.
> >
> > On Tue, Apr 3, 2018 at 11:43 AM, Gian Merlino <g...@apache.org> wrote:
> >
> > > Hi Druids,
> > >
> > > Our podling report is due tomorrow and I am starting work on it.
> > >
> > > I will post a draft here before editing the incubator wiki.
> > >
> >
>


Re: [druid-dev] consider doing a 0.12.1 release

2018-04-09 Thread Gian Merlino
My feeling is that #3 and #2 are borderline, but #1 definitely warrants a
new release. Personally I have seen it occur at least a half dozen times,
and I had been thinking about proposing a Druid 0.12.1 release, so I'm glad
you brought it up.

If we do 0.12.1, it would be another non-ASF release (we haven't got the
ASF process set up yet, and are not likely to have it set up in time) so we
should notify the incubator folks about it.

I would also consider including these fixes in 0.12.1:

- DoublesSketchModule: Fix serde for DoublesSketchMergeAggregatorFactory.
(#5587)
- ArrayAggregation: Use long to avoid overflow (#5544)
- Respect forceHashAggregation in queryContext (#5533)
- Fix indexTask to respect forceExtendableShardSpecs (#5509)
- Add overlord unsecured paths to coordinator when using combined service
(#5579)
- Fix SQLMetadataSegmentManager to allow succesive start and stop (#5554)
- Fix supervisor tombstone auth handling (#5504)
- Authorize supervisor history instead of current active supervisors for
supervisor history API (#5501)
- Fix round robining in router. (#5500)
- SegmentMetadataQuery: Fix default interval handling. (#5489)
- Log exceptions thrown before persist() for indexing tasks (#5374)
- More memory limiting for HttpPostEmitter (#5300)
- pass configuration from context into JobConf for determining
DatasourceInputFormat splits (#5408)
- Lookups: Inherit "injective" from registered lookups, improve docs.
(#5316)
- SQL: Throttle metadata refreshes when they fail. (#5328)

On Mon, Apr 9, 2018 at 10:35 AM, Gian Merlino <g...@imply.io> wrote:

> I think this conversation is worth having. I have cross posted this to
> dev@druid.apache.org and will reply there. Since we're trying to migrate
> the dev list, please cross post any dev messages there, or even only post
> to that list.
>
> Gian
>
> On Sun, Apr 8, 2018 at 5:45 PM, Prashant Deva <prashant.d...@gmail.com>
> wrote:
>
>> Current 0.12.0 release has some major issues:
>>
>>
>>   1. Coordinator loses leadership
>>   https://github.com/druid-io/druid/issues/5561
>>
>>   2. Newly introduced Quantiles sketch is broken
>>   https://github.com/druid-io/druid/issues/5575
>>
>>   3. Coordinator+overlord web console broken
>>   https://github.com/druid-io/druid/issues/5559
>>
>>
>> Issue 1 is especially important. Without a coordinator, Druid stops
>> functioning.
>> With bug 5561, it is impossible to use Druid for long periods, since the
>> coordinator eventually loses leadership and the whole process needs to
>> be restarted for it to come back.
>>
>> *Why not wait till 0.13.0?*
>>
>> A lot of companies like to update one version at a time and may not want
>> to jump directly to 0.13.0.
>> Those companies will hit a bad surprise, since bug 5561 essentially
>> renders the cluster useless in production.
>> Also, quantiles being a new feature and already broken does not look good
>> either.
>>
>> 0.12.0 is a 'release', not an RC, which marks it as good for production,
>> but the bugs listed above prevent it from being used as such.
>> I highly recommend a 0.12.1 release, marking it as the right version to
>> upgrade to from 0.11.0.
>>
>>
>
>


Re: [druid-dev] consider doing a 0.12.1 release

2018-04-09 Thread Gian Merlino
It looks like there's a lot of support for doing a 0.12.1… I'll make a
milestone in GH.

On Mon, Apr 9, 2018 at 12:12 PM, Slim Bouguerra <bs...@apache.org> wrote:

> +1, especially since there are some instability bugs.
>
> On Apr 9, 2018, at 11:57 AM, Gian Merlino <g...@apache.org> wrote:
>
> My feeling is that #3 and #2 are borderline, but #1 definitely warrants a
> new release. Personally I have seen it occur at least a half dozen times,
> and I had been thinking about proposing a Druid 0.12.1 release, so I'm glad
> you brought it up.
>
> If we do 0.12.1, it would be another non-ASF release (we haven't got the
> ASF process set up yet, and are not likely to have it set up in time) so we
> should notify the incubator folks about it.
>
> I would also consider including these fixes in 0.12.1:
>
> - DoublesSketchModule: Fix serde for DoublesSketchMergeAggregatorFactory.
> (#5587)
> - ArrayAggregation: Use long to avoid overflow (#5544)
> - Respect forceHashAggregation in queryContext (#5533)
> - Fix indexTask to respect forceExtendableShardSpecs (#5509)
> - Add overlord unsecured paths to coordinator when using combined service
> (#5579)
> - Fix SQLMetadataSegmentManager to allow succesive start and stop (#5554)
> - Fix supervisor tombstone auth handling (#5504)
> - Authorize supervisor history instead of current active supervisors for
> supervisor history API (#5501)
> - Fix round robining in router. (#5500)
> - SegmentMetadataQuery: Fix default interval handling. (#5489)
> - Log exceptions thrown before persist() for indexing tasks (#5374)
> - More memory limiting for HttpPostEmitter (#5300)
> - pass configuration from context into JobConf for determining
> DatasourceInputFormat splits (#5408)
> - Lookups: Inherit "injective" from registered lookups, improve docs.
> (#5316)
> - SQL: Throttle metadata refreshes when they fail. (#5328)
>
> On Mon, Apr 9, 2018 at 10:35 AM, Gian Merlino <g...@imply.io> wrote:
>
> I think this conversation is worth having. I have cross posted this to
> dev@druid.apache.org and will reply there. Since we're trying to migrate
> the dev list, please cross post any dev messages there, or even only post
> to that list.
>
> Gian
>
> On Sun, Apr 8, 2018 at 5:45 PM, Prashant Deva <prashant.d...@gmail.com>
> wrote:
>
> Current 0.12.0 release has some major issues:
>
>
>   1. Coordinator loses leadership
>   https://github.com/druid-io/druid/issues/5561
>
>   2. Newly introduced Quantiles sketch is broken
>   https://github.com/druid-io/druid/issues/5575
>
>   3. Coordinator+overlord web console broken
>   https://github.com/druid-io/druid/issues/5559
>
>
> Issue 1 is especially important. Without a coordinator, Druid stops
> functioning.
> With bug 5561, it is impossible to use Druid for long periods, since the
> coordinator eventually loses leadership and the whole process needs to
> be restarted for it to come back.
>
> *Why not wait till 0.13.0?*
>
> A lot of companies like to update one version at a time and may not want
> to jump directly to 0.13.0.
> Those companies will hit a bad surprise, since bug 5561 essentially
> renders the cluster useless in production.
> Also, quantiles being a new feature and already broken does not look good
> either.
>
> 0.12.0 is a 'release', not an RC, which marks it as good for production,
> but the bugs listed above prevent it from being used as such.
> I highly recommend a 0.12.1 release, marking it as the right version to
> upgrade to from 0.11.0.
>
>
>
>


Re: [druid-dev] consider doing a 0.12.1 release

2018-04-09 Thread Gian Merlino
I raised backport PRs for everything on my list except for these three,
since they had some (minor) conflicts and I was using a backport script
that bails out if it sees any conflicts:

- https://github.com/druid-io/druid/pull/5504
- https://github.com/druid-io/druid/pull/5587
- https://github.com/druid-io/druid/pull/5374

I will take a closer look at them later if nobody else gets to them.
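
A backport helper of the kind described above might look like the following minimal sketch. The actual script is not shown in this thread, so everything here is an assumption: the function names are hypothetical, the commits are assumed to be cherry-picked onto an already checked-out release branch, and any non-zero exit from `git cherry-pick` is treated as a conflict.

```python
import subprocess


def git(*args):
    """Run a git command and return its exit code (0 means success)."""
    return subprocess.run(["git", *args]).returncode


def backport(commits, run=git):
    """Cherry-pick each commit in order and bail out at the first failure.

    Returns (applied, failed): the commits that were applied cleanly, and the
    commit that failed (None if every commit applied).
    """
    applied = []
    for sha in commits:
        if run("cherry-pick", "-x", sha) != 0:
            # Leave the branch clean so the failing commit can be handled by hand.
            run("cherry-pick", "--abort")
            return applied, sha
        applied.append(sha)
    return applied, None
```

Passing the command runner in as a parameter keeps the bail-out logic separate from git itself, so it can be exercised without touching a repository.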

Nishant, I added https://github.com/druid-io/druid/pull/5596 to the
milestone.

Prashant, I added https://github.com/druid-io/druid/pull/5611 to the
milestone.

On Mon, Apr 9, 2018 at 1:10 PM, Prashant Deva <prashant.d...@gmail.com>
wrote:

> also for the list:
>
> - compact task throws exception #5611
>
> i dont have permission to add to gian's github milestone
>
> Prashant
>
> On Mon, Apr 9, 2018 at 12:26 PM, Gian Merlino <gianmerl...@gmail.com>
> wrote:
>
> > Here we go: https://github.com/druid-io/druid/milestone/26. Please add
> > stuff if it makes sense (high importance / low risk bug fixes).
> >
> > On Mon, Apr 9, 2018 at 12:14 PM, Gian Merlino <gianmerl...@gmail.com>
> > wrote:
> >
> > > It looks like there's a lot of support for doing a 0.12.1… I'll make a
> > > milestone in GH.
> > >
> > > On Mon, Apr 9, 2018 at 12:12 PM, Slim Bouguerra <bs...@apache.org>
> > > wrote:
> > >
> > >> +1, especially since there are some instability bugs.
> > >>
> > >> On Apr 9, 2018, at 11:57 AM, Gian Merlino <g...@apache.org> wrote:
> > >>
> > >> My feeling is that #3 and #2 are borderline, but #1 definitely warrants a
> > >> new release. Personally I have seen it occur at least a half dozen times,
> > >> and I had been thinking about proposing a Druid 0.12.1 release, so I'm
> > >> glad you brought it up.
> > >>
> > >> If we do 0.12.1, it would be another non-ASF release (we haven't got the
> > >> ASF process set up yet, and are not likely to have it set up in time) so
> > >> we should notify the incubator folks about it.
> > >>
> > >> I would also consider including these fixes in 0.12.1:
> > >>
> > >> - DoublesSketchModule: Fix serde for DoublesSketchMergeAggregatorFactory.
> > >> (#5587)
> > >> - ArrayAggregation: Use long to avoid overflow (#5544)
> > >> - Respect forceHashAggregation in queryContext (#5533)
> > >> - Fix indexTask to respect forceExtendableShardSpecs (#5509)
> > >> - Add overlord unsecured paths to coordinator when using combined service
> > >> (#5579)
> > >> - Fix SQLMetadataSegmentManager to allow succesive start and stop (#5554)
> > >> - Fix supervisor tombstone auth handling (#5504)
> > >> - Authorize supervisor history instead of current active supervisors for
> > >> supervisor history API (#5501)
> > >> - Fix round robining in router. (#5500)
> > >> - SegmentMetadataQuery: Fix default interval handling. (#5489)
> > >> - Log exceptions thrown before persist() for indexing tasks (#5374)
> > >> - More memory limiting for HttpPostEmitter (#5300)
> > >> - pass configuration from context into JobConf for determining
> > >> DatasourceInputFormat splits (#5408)
> > >> - Lookups: Inherit "injective" from registered lookups, improve docs.
> > >> (#5316)
> > >> - SQL: Throttle metadata refreshes when they fail. (#5328)
> > >>
> > >> On Mon, Apr 9, 2018 at 10:35 AM, Gian Merlino <g...@imply.io> wrote:
> > >>
> > >> I think this conversation is worth having. I have cross posted this to
> > >> dev@druid.apache.org and will reply there. Since we're trying to migrate
> > >> the dev list, please cross post any dev messages there, or even only post
> > >> to that list.
> > >>
> > >> Gian
> > >>
> > >> On Sun, Apr 8, 2018 at 5:45 PM, Prashant Deva <prashant.d...@gmail.com>
> > >> wrote:
> > >>
> > >> Current 0.12.0 release has some major issues:
> > >>
> > >>
> > >>   1. Coordinator loses leadership
> > >>   https://github.com/druid-io/druid/issues/5561
> > >>
> > >>   2. Newly introduced Quantiles sketch is broken
> > >>   https://github.com/druid-io/druid/issues/5575
> > >>
> > >>   3. Coordinator+overlord web console broken
> > >>

Re: [druid-dev] consider doing a 0.12.1 release

2018-04-09 Thread Gian Merlino
I backported 5504 and 5587. I skipped 5374 since it turns out it isn't
relevant to 0.12.0 -- it fixes something introduced since then -- so I
moved that back to 0.13.0.

On Mon, Apr 9, 2018 at 1:56 PM, Gian Merlino <g...@apache.org> wrote:

> I raised backport PRs for everything on my list except for these three,
> since they had some (minor) conflicts and I was using a backport script
> that bails out if it sees any conflicts:
>
> - https://github.com/druid-io/druid/pull/5504
> - https://github.com/druid-io/druid/pull/5587
> - https://github.com/druid-io/druid/pull/5374
>
> I will take a closer look at them later if nobody else gets to them.
>
> Nishant, I added https://github.com/druid-io/druid/pull/5596 to the
> milestone.
>
> Prashant, I added https://github.com/druid-io/druid/pull/5611 to the
> milestone.
>
> On Mon, Apr 9, 2018 at 1:10 PM, Prashant Deva <prashant.d...@gmail.com>
> wrote:
>
>> also for the list:
>>
>> - compact task throws exception #5611
>>
>> i dont have permission to add to gian's github milestone
>>
>> Prashant
>>
>> On Mon, Apr 9, 2018 at 12:26 PM, Gian Merlino <gianmerl...@gmail.com>
>> wrote:
>>
>> > Here we go: https://github.com/druid-io/druid/milestone/26. Please add
>> > stuff if it makes sense (high importance / low risk bug fixes).
>> >
>> > On Mon, Apr 9, 2018 at 12:14 PM, Gian Merlino <gianmerl...@gmail.com>
>> > wrote:
>> >
>> > > It looks like there's a lot of support for doing a 0.12.1… I'll make a
>> > > milestone in GH.
>> > >
>> > > On Mon, Apr 9, 2018 at 12:12 PM, Slim Bouguerra <bs...@apache.org>
>> > > wrote:
>> > >
>> > >> +1, especially since there are some instability bugs.
>> > >>
>> > >> On Apr 9, 2018, at 11:57 AM, Gian Merlino <g...@apache.org> wrote:
>> > >>
>> > >> My feeling is that #3 and #2 are borderline, but #1 definitely warrants
>> > >> a new release. Personally I have seen it occur at least a half dozen
>> > >> times, and I had been thinking about proposing a Druid 0.12.1 release,
>> > >> so I'm glad you brought it up.
>> > >>
>> > >> If we do 0.12.1, it would be another non-ASF release (we haven't got the
>> > >> ASF process set up yet, and are not likely to have it set up in time) so
>> > >> we should notify the incubator folks about it.
>> > >>
>> > >> I would also consider including these fixes in 0.12.1:
>> > >>
>> > >> - DoublesSketchModule: Fix serde for DoublesSketchMergeAggregatorFactory.
>> > >> (#5587)
>> > >> - ArrayAggregation: Use long to avoid overflow (#5544)
>> > >> - Respect forceHashAggregation in queryContext (#5533)
>> > >> - Fix indexTask to respect forceExtendableShardSpecs (#5509)
>> > >> - Add overlord unsecured paths to coordinator when using combined service
>> > >> (#5579)
>> > >> - Fix SQLMetadataSegmentManager to allow succesive start and stop (#5554)
>> > >> - Fix supervisor tombstone auth handling (#5504)
>> > >> - Authorize supervisor history instead of current active supervisors for
>> > >> supervisor history API (#5501)
>> > >> - Fix round robining in router. (#5500)
>> > >> - SegmentMetadataQuery: Fix default interval handling. (#5489)
>> > >> - Log exceptions thrown before persist() for indexing tasks (#5374)
>> > >> - More memory limiting for HttpPostEmitter (#5300)
>> > >> - pass configuration from context into JobConf for determining
>> > >> DatasourceInputFormat splits (#5408)
>> > >> - Lookups: Inherit "injective" from registered lookups, improve docs.
>> > >> (#5316)
>> > >> - SQL: Throttle metadata refreshes when they fail. (#5328)
>> > >>
>> > >> On Mon, Apr 9, 2018 at 10:35 AM, Gian Merlino <g...@imply.io> wrote:
>> > >>
>> > >> I think this conversation is worth having. I have cross posted this to
>> > >> dev@druid.apache.org and will reply there. Since we're trying to migrate
>> > >> the dev list, please cross post any dev messages there, or even only post
>> > >> to that list.
>> > >>

Re: [druid-dev] consider doing a 0.12.1 release

2018-04-09 Thread Gian Merlino
Here we go: https://github.com/druid-io/druid/milestone/26. Please add
stuff if it makes sense (high importance / low risk bug fixes).

On Mon, Apr 9, 2018 at 12:14 PM, Gian Merlino <gianmerl...@gmail.com> wrote:

> It looks like there's a lot of support for doing a 0.12.1… I'll make a
> milestone in GH.
>
> On Mon, Apr 9, 2018 at 12:12 PM, Slim Bouguerra <bs...@apache.org> wrote:
>
>> +1, especially since there are some instability bugs.
>>
>> On Apr 9, 2018, at 11:57 AM, Gian Merlino <g...@apache.org> wrote:
>>
>> My feeling is that #3 and #2 are borderline, but #1 definitely warrants a
>> new release. Personally I have seen it occur at least a half dozen times,
>> and I had been thinking about proposing a Druid 0.12.1 release, so I'm
>> glad
>> you brought it up.
>>
>> If we do 0.12.1, it would be another non-ASF release (we haven't got the
>> ASF process set up yet, and are not likely to have it set up in time) so
>> we
>> should notify the incubator folks about it.
>>
>> I would also consider including these fixes in 0.12.1:
>>
>> - DoublesSketchModule: Fix serde for DoublesSketchMergeAggregatorFactory.
>> (#5587)
>> - ArrayAggregation: Use long to avoid overflow (#5544)
>> - Respect forceHashAggregation in queryContext (#5533)
>> - Fix indexTask to respect forceExtendableShardSpecs (#5509)
>> - Add overlord unsecured paths to coordinator when using combined service
>> (#5579)
>> - Fix SQLMetadataSegmentManager to allow succesive start and stop (#5554)
>> - Fix supervisor tombstone auth handling (#5504)
>> - Authorize supervisor history instead of current active supervisors for
>> supervisor history API (#5501)
>> - Fix round robining in router. (#5500)
>> - SegmentMetadataQuery: Fix default interval handling. (#5489)
>> - Log exceptions thrown before persist() for indexing tasks (#5374)
>> - More memory limiting for HttpPostEmitter (#5300)
>> - pass configuration from context into JobConf for determining
>> DatasourceInputFormat splits (#5408)
>> - Lookups: Inherit "injective" from registered lookups, improve docs.
>> (#5316)
>> - SQL: Throttle metadata refreshes when they fail. (#5328)
>>
>> On Mon, Apr 9, 2018 at 10:35 AM, Gian Merlino <g...@imply.io> wrote:
>>
>> I think this conversation is worth having. I have cross posted this to
>> dev@druid.apache.org and will reply there. Since we're trying to migrate
>> the dev list, please cross post any dev messages there, or even only post
>> to that list.
>>
>> Gian
>>
>> On Sun, Apr 8, 2018 at 5:45 PM, Prashant Deva <prashant.d...@gmail.com>
>> wrote:
>>
>> Current 0.12.0 release has some major issues:
>>
>>
>>   1. Coordinator loses leadership
>>   https://github.com/druid-io/druid/issues/5561
>>
>>   2. Newly introduced Quantiles sketch is broken
>>   https://github.com/druid-io/druid/issues/5575
>>
>>   3. Coordinator+overlord web console broken
>>   https://github.com/druid-io/druid/issues/5559
>>
>>
>> Issue 1 is especially important. Without a coordinator, Druid stops
>> functioning.
>> With bug 5561, it is impossible to use Druid for long periods, since the
>> coordinator eventually loses leadership and the whole process needs to
>> be restarted for it to come back.
>>
>> *Why not wait till 0.13.0?*
>>
>> A lot of companies like to update one version at a time and may not want
>> to jump directly to 0.13.0.
>> Those companies will hit a bad surprise, since bug 5561 essentially
>> renders the cluster useless in production.
>> Also, quantiles being a new feature and already broken does not look good
>> either.
>>
>> 0.12.0 is a 'release', not an RC, which marks it as good for production,
>> but the bugs listed above prevent it from being used as such.
>> I highly recommend a 0.12.1 release, marking it as the right version to
>> upgrade to from 0.11.0.
>>

Re: Podling report - April 2018

2018-04-06 Thread Gian Merlino
If you restart from the beginning, then the new committer process is to
fill out the form linked from http://www.apache.org/licenses/#clas, choose
an apache id that isn't used (http://people.apache.org/committer-index.html),
and include "druid" in the "project to notify" field.

On Fri, Apr 6, 2018 at 9:29 AM, Roman Leventov <leventov...@gmail.com>
wrote:

> If I go this path, what should I actually do? Sorry, I probably missed the
> instructions.
>
> On Fri, 6 Apr 2018, 19:26 Xavier Léauté, <xav...@confluent.io> wrote:
>
> > Otherwise, I would suggest Roman just request an Apache ID directly and
> > re-sign the icla like everyone else did. That will probably be faster than
> > tracking down the old cla.
> >
> > On Fri, Apr 6, 2018 at 12:05 AM Gian Merlino <gianmerl...@gmail.com>
> > wrote:
> >
> > > https://community.apache.org/newcommitter.html says that new accounts are
> > > created by PMC chairs. In our case that would probably be the Incubator
> > > PMC? So I'd try emailing gene...@incubator.apache.org and seeing if
> > > someone there can help.
> > >
> > > On Thu, Apr 5, 2018 at 12:16 AM, Roman Leventov <leventov...@gmail.com>
> > > wrote:
> > >
> > > > No I don't have an ID, just signed for making a PR to some project.
> > > >
> > > > On Thu, 5 Apr 2018, 02:49 Parag Jain, <paragjai...@gmail.com> wrote:
> > > >
> > > > > I've signed the ICLA today. Will update the page once I get the
> > > > > Apache id.
> > > > >
> > > > > On Wed, Apr 4, 2018, 6:16 PM Xavier Léauté <xav...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Roman, in that case you should already have an Apache id, can you
> > > > > > add it to the page?
> > > > > >
> > > > > > On Wed, Apr 4, 2018 at 16:00 Roman Leventov <leventov...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I've signed an ICLA for a different project before, doesn't it
> > > > > > > count?
> > > > > > >
> > > > > > > On Thu, 5 Apr 2018, 01:39 Gian Merlino, <gianmerl...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Ah okay cool, thanks
> > > > > > > >
> > > > > > > > On Wed, Apr 4, 2018 at 11:27 AM, Xavier Léauté
> > > > > > > > <xav...@confluent.io> wrote:
> > > > > > > >
> > > > > > > > > I was just looking at http://people.apache.org/
> > > > > > > > >
> > > > > > > > > On Wed, Apr 4, 2018 at 7:59 AM Gian Merlino
> > > > > > > > > <gianmerl...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > I saved the report to the incubator wiki at
> > > > > > > > > > https://wiki.apache.org/incubator/April2018.
> > > > > > > > > >
> > > > > > > > > > What's the directory you're looking at for ICLAs?
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 3, 2018 at 4:43 PM, Xavier Léauté
> > > > > > > > > > <xav...@confluent.io> wrote:
> > > > > > > > > >
> > > > > > > > > > > I think I've seen ICLAs go through for most of the
> > > > > > > > > > > committers.
> > > > > > > > > > > The only ones I can't find in the directory are Parag and
> > > > > > > > > > > Roman.
> > > > > > > > > > >
> > > > > > > > > > > Everyone that hasn't updated the incubator page at
> > > > > > > > > > > http://incubator.apache.org/projects/druid.html with
> > > > > > > > > > > their apache id, would you mind doing so?
> > > > > > > > > > >

Re: Dependencies licenses Report

2018-04-18 Thread Gian Merlino
Hi Slim,

Do you know if ORC & Hive use this tool as part of their release process?
And if it's considered a good tool by itself for verifying we meet all of
the Apache licensing requirements, or if we'll need something else too?

On Tue, Apr 17, 2018 at 9:15 PM, Slim Bouguerra  wrote:

> One of the questions at the last dev sync was about the generation of
> dependency licenses.
> Some projects (ORC and Hive) use the Maven site plugin, which can generate
> reports with all the dependency and license details.
> I have run it on Druid, and this is how it looks for the Druid API module.
> Command:
>
> mvn project-info-reports:dependencies
>
> The site directory can be found under target/site.
> Here is an example for one module:
> https://drive.google.com/file/d/1P8R0kZjp8zP4WSOVrKdlJF7Xr8-OI7Oe/view?usp=sharing
>
> Also, no fancy tools are used to detect unwanted licenses; that is done while
> reviewing PRs.
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> For additional commands, e-mail: dev-h...@druid.apache.org
>


Today's dev sync

2018-04-17 Thread Gian Merlino
https://hangouts.google.com/hangouts/_/calendar/Z2lhbm1lcmxpbm9AZ21haWwuY29t.73eguu7d0a9gr27kt1ugitllq3


Re: Dependencies licenses Report

2018-04-20 Thread Gian Merlino
Does anyone have experience with RAT (https://creadur.apache.org/rat/) and
a willingness to do a PR to set it up for us? I think we can do this even
before migrating sources to Apache.
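
For readers unfamiliar with what a header check does, here is a minimal sketch of the idea. It is not RAT and does not reproduce RAT's matching rules; the function name, the marker string, and the 30-line window are all assumptions chosen for illustration.

```python
# Assumed marker: a phrase that appears in the standard Apache 2.0 source header.
MARKER = "Apache License, Version 2.0"


def missing_header(paths, marker=MARKER, head_lines=30):
    """Return the paths whose first head_lines lines do not mention marker."""
    missing = []
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as f:
            # Read at most head_lines lines; headers sit at the top of a file.
            head = "".join(line for _, line in zip(range(head_lines), f))
        if marker not in head:
            missing.append(path)
    return missing
```

A check like this only covers source headers. As noted elsewhere in this thread, the LICENSE and NOTICE files still have to be assembled and reviewed by hand.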

On Fri, Apr 20, 2018 at 10:52 AM, Slim Bouguerra <slim.bougue...@gmail.com>
wrote:

> As suggested above, RAT is used as a first filter that does most of the
> checking, but it is not enough on its own.
> The mvn site plugin is used to collect the list of dependencies, but it is
> not enough either.
> They manually edit/create the LICENSE/NOTICE files. It is done by hand, by a
> human, to avoid any glitch that an automatic tool might introduce and to
> ensure that someone has looked at it.
> It seems time consuming the first time, but after that it should be
> incremental and thus not that hard.
>
>
>
> On Wed, Apr 18, 2018 at 8:45 AM, Julian Hyde <jhyde.apa...@gmail.com>
> wrote:
>
> > The main tool to use is Apache RAT. Definitely use that.
> >
> > One of the hardest tasks is getting the contents of LICENSE and NOTICE
> > right. That is a manual task I’m afraid.
> >
> > Julian
> >
> > > On Apr 18, 2018, at 08:34, Gian Merlino <gianmerl...@gmail.com> wrote:
> > >
> > > Hi Slim,
> > >
> > > Do you know if ORC & Hive use this tool as part of their release process?
> > > And if it's considered a good tool by itself for verifying we meet all of
> > > the Apache licensing requirements, or if we'll need something else too?
> > >
> > >> On Tue, Apr 17, 2018 at 9:15 PM, Slim Bouguerra <bs...@apache.org> wrote:
> > >>
> > >> One of the questions at the last dev sync was about generating
> > >> dependency license reports.
> > >> Some projects (ORC and Hive) use the Maven site plugin, which can
> > >> generate reports with all the dependency and license details.
> > >> I ran it on Druid; this is how it looks for the Druid API module.
> > >> Command:
> > >>
> > >> mvn project-info-reports:dependencies
> > >>
> > >> The site directory can be found under target/site.
> > >> Here is an example for one module:
> > >> https://drive.google.com/file/d/1P8R0kZjp8zP4WSOVrKdlJF7Xr8-OI7Oe/view?usp=sharing
> > >>
> > >> Also, no fancy tools are used to detect unwanted licenses; that is done
> > >> while reviewing PRs.
> > >>
> > >>
> > >>
> > >> -
> > >> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > >> For additional commands, e-mail: dev-h...@druid.apache.org
> > >>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > For additional commands, e-mail: dev-h...@druid.apache.org
> >
> >
>
>
> --
>
> B-Slim
> ___/\/\/\___/\/\/\___/\/\/\___/\/\/\___/\/\/\___
>
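
For anyone picking up the RAT question, here is a minimal sketch of invoking the Apache RAT Maven plugin from the command line. It assumes the plugin's fully qualified coordinates `org.apache.rat:apache-rat-plugin` and its `check` goal; the helper name `rat_check` and the `MVN` override are hypothetical. RAT inspects files in the source tree, so, as noted in the thread, it does not replace the manual LICENSE/NOTICE work.

```shell
# MVN may be overridden (for example MVN=echo) to print the command instead
# of running it; by default the mvn found on PATH is used.
MVN="${MVN:-mvn}"

# Hypothetical helper: run the RAT check goal using the plugin's full
# coordinates, so the plugin does not have to be declared in the POM first.
# Any extra arguments are passed through to Maven unchanged.
rat_check() {
  "$MVN" org.apache.rat:apache-rat-plugin:check "$@"
}
```

Using the full coordinates is a choice made for this sketch: it lets someone try RAT on a checkout before any PR adds the plugin to the build.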


Re: Web site

2018-04-16 Thread Gian Merlino
We have a bit of a hybrid setup today: the docs (a big part of the site)
are in the main "druid" repo. The rest of the site (landing page, news
page, download page) is in a separate website repo. It makes sense to me
because we want to version the docs along with the code, but we _don't_
want to version stuff like the landing page and news as part of the code,
we'd rather they "float" as a separate thing.

The one thing I don't like about how we do the site today is the problem
you mentioned: the built site branch does get large, since it has docs for
every version we've ever released.

Maybe it makes sense to have three repos: one for the sources and
docs/tutorials (which are tied to the sources), one for the "floating"
parts of the site that are untethered to any particular release, and one
for the built site?

On Mon, Apr 16, 2018 at 9:44 AM, Julian Hyde <jh...@apache.org> wrote:

> (Speaking not as a mentor, just someone who has deployed sites on ASF
> infrastructure.)
>
> What makes most sense to me is to put the web site source in master along
> with the source code, and to put the generated site in a different git repo
> (not just a different branch). It allows you to make a commit that changes
> both source code and the web site.
>
> And it prevents the source repo from becoming really large (generated web
> sites can be large if you deploy 100M of generated java doc each release).
> You don’t want every contributor to have to download a huge git repo.
>
> Julian
>
>
> > On Apr 16, 2018, at 8:48 AM, Gian Merlino <g...@apache.org> wrote:
> >
> > A technical note that I also posted in the "migration logistics" thread:
> > the sources for the site are at
> > https://github.com/apache/incubator-druid-website. The branch "asf-git" is
> > served on the site https://druid.incubator.apache.org/. I think once we
> > migrate http://druid.io/, we could do something similar to what we do on
> > https://github.com/druid-io/druid-io.github.io, where sources are in "src"
> > and the built site is in "master". Except when using ASF infra, it makes
> > more sense to put the sources in "master" and the built site in "asf-git".
> >
> >
> > On Sun, Apr 15, 2018 at 3:58 PM, Julian Hyde <jh...@apache.org> wrote:
> >
> >> There has been some back-channel discussion about the web site.
> >>
> >> Druid has a good and successful web site outside of Apache, namely
> >> druid.io. We cannot start transitioning that site until the legal IP
> >> transfer has completed. In the meantime, we were left without a web
> >> site: requests to http://druid.apache.org/ and
> >> http://druid.incubator.apache.org/ would receive an HTTP 404.
> >>
> >> Gian has created a simple web site in Apache that has hyperlinks to
> >> druid.io and references the current user list
> >> druid-u...@googlegroups.com. Links to outside Apache are regarded as a
> >> breach of Apache branding policy and are frowned upon; but as a mentor I
> >> totally understand why they are necessary: Druid has a great community,
> >> and we must protect that community during the transition to Apache.
> >>
> >> The current web site is good enough for the short term, but let’s get a
> >> proper branding-compliant web site up and running as soon as we can.
> >> Let’s make it one of the “top three” tasks listed in each board report.
> >>
> >> I see [1] that Gian is pushing to move traffic from
> >> druid-...@googlegroups.com to this dev list. That effort is most
> >> welcome, also.
> >>
> >> Julian
> >>
> >> [1] https://groups.google.com/d/msg/druid-development/q1ip-L8xpBk/hDCYaIsQCgAJ
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> For additional commands, e-mail: dev-h...@druid.apache.org
>
>


Re: Web site

2018-04-16 Thread Gian Merlino
A technical note that I also posted in the "migration logistics" thread:
the sources for the site are at
https://github.com/apache/incubator-druid-website. The branch "asf-git" is
served on the site https://druid.incubator.apache.org/. I think once we
migrate http://druid.io/, we could do something similar to what we do on
https://github.com/druid-io/druid-io.github.io, where sources are in "src"
and the built site is in "master". Except when using ASF infra, it makes
more sense to put the sources in "master" and the built site in "asf-git".


On Sun, Apr 15, 2018 at 3:58 PM, Julian Hyde  wrote:

> There has been some back-channel discussion about the web site.
>
> Druid has a good and successful web site outside of Apache, namely
> druid.io. We cannot start transitioning that site until the legal IP
> transfer has completed. In the meantime, we were left without a web site:
> requests to http://druid.apache.org/ and
> http://druid.incubator.apache.org/ would receive an HTTP 404.
>
> Gian has created a simple web site in Apache that has hyperlinks to
> druid.io and references the current user list
> druid-u...@googlegroups.com. Links to outside Apache are regarded as a
> breach of Apache branding policy and are frowned upon; but as a mentor I
> totally understand why they are necessary: Druid has a great community,
> and we must protect that community during the transition to Apache.
>
> The current web site is good enough for the short term, but let’s get a
> proper branding-compliant web site up and running as soon as we can. Let’s
> make it one of the “top three” tasks listed in each board report.
>
> I see [1] that Gian is pushing to move traffic from
> druid-...@googlegroups.com to this dev list. That effort is most welcome,
> also.
>
> Julian
>
> [1] https://groups.google.com/d/msg/druid-development/q1ip-L8xpBk/hDCYaIsQCgAJ
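
The branch layout proposed at the top of this message (sources in "master", built site in "asf-git") could be started roughly as sketched below. This is one possible approach and not something agreed in the thread: it assumes an orphan branch is acceptable for the generated output, and the helper name `start_built_site_branch` and the `GIT` override are hypothetical.

```shell
# GIT may be overridden (for example GIT=echo) to print the commands instead
# of running them; by default the git found on PATH is used.
GIT="${GIT:-git}"

# Hypothetical helper: in a repository whose "master" branch already holds
# the site sources, start an "asf-git" branch for the generated output.
start_built_site_branch() {
  repo_dir="$1"
  # An orphan branch shares no history with master, so generated files do
  # not add to the history of the sources.
  "$GIT" -C "$repo_dir" checkout --orphan asf-git
  # The orphan checkout leaves the source files staged; clear them so the
  # branch starts empty and can receive only the built site.
  "$GIT" -C "$repo_dir" rm -rf --quiet .
}
```

After this, the built site would be copied in, added, and committed on "asf-git", while day-to-day edits continue on "master".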


Re: [druid-dev] Apache migration logistics

2018-04-15 Thread Gian Merlino
Also, the sources for the site are at:
https://github.com/apache/incubator-druid-website. The branch "asf-git" is
served on the site. I think once we migrate http://druid.io/, we could do
something similar to what we do on
https://github.com/druid-io/druid-io.github.io, where sources are in "src"
and the built site is in "master". Except in the ASF infra, it makes more
sense to put the sources in "master" and the built site in "asf-git".

On Sun, Apr 15, 2018 at 7:56 AM, Gian Merlino <gianmerl...@gmail.com> wrote:

> FYI: We received an inquiry about where our web site was, and so we set up
> a placeholder site at: http://druid.incubator.apache.org/.
>
> Last I heard, the SGA paperwork is still being worked on, and after that
> is done we should start migrating the sources and site.
>
> In the meantime should we consider setting a cutoff date for shutting down
> druid-developm...@googlegroups.com list, in favor of dev@druid.apache.org?
> We have been encouraging people to use the latter list for a few weeks now.
> Maybe one more week: April 23?
>
> On Tue, Mar 27, 2018 at 6:19 PM, Julian Hyde <jhyde.apa...@gmail.com>
> wrote:
>
>> Per https://incubator.apache.org/guides/transitioning_asf.html we need
>> to get SGA/CCLA on file, then we should file a JIRA case similar to
>> https://issues.apache.org/jira/browse/INFRA-15735 or
>> https://issues.apache.org/jira/browse/INFRA-12261 to do the import.
>>
>> Julian
>>
>>
>> On Mar 27, 2018, at 5:20 PM, Gian Merlino <gianmerl...@gmail.com> wrote:
>>
>> Hi, today I found myself wondering about migration of source repos.
>>
>> Does anyone know if we've got our GitBox git repo set up? And who we need
>> to talk to about getting the druid-io repos transferred to the apache org
>> in github such that we can do source control in an Apache-certified-okay
>> way? It sounded like we are going to be able to keep using GitHub PRs and
>> issues. So I'm hoping we can do this process:
>> https://help.github.com/articles/about-repository-transfers/ which lets
>> us keep all the issues, watchers, & stars intact.
>>
>> On Mon, Mar 12, 2018 at 3:25 PM, Xavier Léauté <xav...@confluent.io>
>> wrote:
>>
>> FYI, to update your information on the status page you need to check out
>> https://svn.apache.org/repos/asf/incubator/public/trunk/content/projects/
>> with your Apache credentials and update the druid.xml file in that
>> directory.
>>
>> On Mon, Mar 12, 2018 at 2:49 PM Gian Merlino <g...@imply.io> wrote:
>>
>> Committers: please,
>>
>> 1) If you don't have an apache id already, fill out an ICLA:
>> https://www.apache.org/dev/new-committers-guide.html#guide-for-new-committers
>> and then post here and hopefully someone can figure out how to get you an
>> id?
>>
>> 2) When you have an id, post it here if it's not in
>> http://incubator.apache.org/projects/druid.html so someone can figure out
>> how to add you to that, and then also try to sign up to
>> private-subscr...@druid.apache.org (+ dev-subscr...@druid.apache.org
>> which you should be on already). If you can't, then also post here, and
>> hopefully someone can figure _that_ out.
>>
>> Gian
>>
>> On Fri, Mar 9, 2018 at 11:28 AM, Xavier Léauté <xav...@confluent.io>
>> wrote:
>>
>> This thread is already going to both lists, and it looks like responses
>> automatically go to both. Would be good to check what happens if we
>> subscribe dev@ to the google group. If responding from the apache list
>> doesn't automatically add the google group as well, it will be hard to
>> keep the group useful.
>>
>> Agree with Julian a cutoff is necessary anyway, since the google group
>> inherently becomes less useful over time, as some information only ends
>> up in the apache list.
>>
>> On Fri, Mar 9, 2018 at 11:14 AM Nishant Bangarwa <
>> nishant.mon...@gmail.com> wrote:
>>
>> We can register dev@druid.apache.org and us...@druid.apache.org as a
>> user in druid user groups so that going forward any mails that are sent
>> to druid google groups are also received on the apache lists and are on
>> the record. This would be to bridge the gap during the migration only.
>>
>> @Julian, I can go ahead and try setting this up, if this seems reasonable?
>>
>> On Sat, 10 Mar 2018 at 00:09 Julian Hyde <jhyde.apa...@gmail.com> wrote:
>>

Re: [druid-dev] Apache migration logistics

2018-04-15 Thread Gian Merlino
FYI: We received an inquiry about where our web site was, and so we set up
a placeholder site at: http://druid.incubator.apache.org/.

Last I heard, the SGA paperwork is still being worked on, and after that is
done we should start migrating the sources and site.

In the meantime should we consider setting a cutoff date for shutting down
druid-developm...@googlegroups.com list, in favor of dev@druid.apache.org?
We have been encouraging people to use the latter list for a few weeks now.
Maybe one more week: April 23?

On Tue, Mar 27, 2018 at 6:19 PM, Julian Hyde <jhyde.apa...@gmail.com> wrote:

> Per https://incubator.apache.org/guides/transitioning_asf.html we need to
> get SGA/CCLA on file, then we should file a JIRA case similar to
> https://issues.apache.org/jira/browse/INFRA-15735 or
> https://issues.apache.org/jira/browse/INFRA-12261 to do the import.
>
> Julian
>
>
> On Mar 27, 2018, at 5:20 PM, Gian Merlino <gianmerl...@gmail.com> wrote:
>
> Hi, today I found myself wondering about migration of source repos.
>
> Does anyone know if we've got our GitBox git repo set up? And who we need
> to talk to about getting the druid-io repos transferred to the apache org
> in github such that we can do source control in an Apache-certified-okay
> way? It sounded like we are going to be able to keep using GitHub PRs and
> issues. So I'm hoping we can do this process:
> https://help.github.com/articles/about-repository-transfers/ which lets us
> keep all the issues, watchers, & stars intact.
>
> On Mon, Mar 12, 2018 at 3:25 PM, Xavier Léauté <xav...@confluent.io>
> wrote:
>
> FYI, to update your information on the status page you need to check out
> https://svn.apache.org/repos/asf/incubator/public/trunk/content/projects/
> with your Apache credentials and update the druid.xml file in that
> directory.
>
> On Mon, Mar 12, 2018 at 2:49 PM Gian Merlino <g...@imply.io> wrote:
>
> Committers: please,
>
> 1) If you don't have an apache id already, fill out an ICLA:
> https://www.apache.org/dev/new-committers-guide.html#guide-for-new-committers
> and then post here and hopefully someone can figure out how to get you an
> id?
>
> 2) When you have an id, post it here if it's not in
> http://incubator.apache.org/projects/druid.html so someone can figure out
> how to add you to that, and then also try to sign up to
> private-subscr...@druid.apache.org (+ dev-subscr...@druid.apache.org
> which you should be on already). If you can't, then also post here, and
> hopefully someone can figure _that_ out.
>
> Gian
>
> On Fri, Mar 9, 2018 at 11:28 AM, Xavier Léauté <xav...@confluent.io>
> wrote:
>
> This thread is already going to both lists, and it looks like responses
> automatically go to both. Would be good to check what happens if we
> subscribe dev@ to the google group. If responding from the apache list
> doesn't automatically add the google group as well, it will be hard to
> keep the group useful.
>
> Agree with Julian a cutoff is necessary anyway, since the google group
> inherently becomes less useful over time, as some information only ends
> up in the apache list.
>
> On Fri, Mar 9, 2018 at 11:14 AM Nishant Bangarwa <
> nishant.mon...@gmail.com> wrote:
>
> We can register dev@druid.apache.org and us...@druid.apache.org as a
> user in druid user groups so that going forward any mails that are sent
> to druid google groups are also received on the apache lists and are on
> the record. This would be to bridge the gap during the migration only.
>
> @Julian, I can go ahead and try setting this up, if this seems reasonable?
>
> On Sat, 10 Mar 2018 at 00:09 Julian Hyde <jhyde.apa...@gmail.com> wrote:
>
>
> I don’t know. I don’t think it’s easy.
>
>
> On Mar 9, 2018, at 7:31 AM, Roman Leventov <
> roman.leven...@metamarkets.com> wrote:
>
> Could archives of druid-dev and druid-users mailing lists be
> transferred to the new lists?
>
> On Thu, Mar 8, 2018 at 8:48 AM, Julian Hyde <jh...@apache.org> wrote:
>
> I’m working on it. It turns out that I don’t have sufficient karma to
> create a git repo, so I’ve put in a request on the incubator list.
>
>
> On Mar 6, 2018, at 10:12 AM, Xavier Léauté <xav...@confluent.io>
> wrote:
>
> Julian, it looks like you or one of the mentors has to request the
> source code repos. Could you request a gitbox enabled repo?
>
> Based on https://incubator.apache.org/guides/transitioning_asf.html,
> for the initial migration, we need to involve infra to import the initial
> git history and grant them admin rights to the github repo.
>
> Charles, it also sound

Re: [druid-dev] Apache migration logistics

2018-04-16 Thread Gian Merlino
Oh cool, I didn't realize that. We should stick to https://druid.apache.org/
then.

Gian

On Mon, Apr 16, 2018 at 10:51 AM, Maxime Beauchemin <
maximebeauche...@gmail.com> wrote:

> Quick note to say that Apache sets up both
> http://druid.incubator.apache.org/ and http://druid.apache.org/ to point
> to the same place. I'd recommend always referencing and using
> http://druid.apache.org/ for SEO and future-proofing purposes.
>
> Max
>
> --
> You received this message because you are subscribed to the Google Groups
> "Druid Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to druid-development+unsubscr...@googlegroups.com.
> To post to this group, send email to druid-developm...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/druid-development/e8c99afa-b879-431b-a0b5-64fab278b182%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.
>


Re: Dependencies licenses Report

2018-04-24 Thread Gian Merlino
Do you mean the license headers? Those, I think, we shouldn't change until
the code is imported into Apache.

If it's possible to use Rat to audit dependency licenses without looking at
the license headers of our own files, that would still be useful at this
point.

On Tue, Apr 24, 2018 at 12:57 PM, Slim Bouguerra <bs...@apache.org> wrote:

>
> I think the first step in using RAT is to reformat all the Druid code
> licenses.
> Any idea if this can be done now, or do we need some legal work done first?
>
> On 2018/04/20 17:52:46, Slim Bouguerra <slim.bougue...@gmail.com> wrote:
> > As suggested above, RAT is used as a first filter that does most of the
> > checking, but it is not sufficient on its own.
> > The mvn site plugin is used to collect the list of dependencies, but it
> > is not enough either.
> > They manually edit/create the LICENSE/NOTICE files. It is done by a human
> > to avoid any glitch that an automatic tool would introduce and to ensure
> > that someone has looked at it.
> > It seems time-consuming the first time, but after that it should be
> > incremental and thus not that hard.
> >
> >
> >
> > On Wed, Apr 18, 2018 at 8:45 AM, Julian Hyde <jhyde.apa...@gmail.com>
> wrote:
> >
> > > The main tool to use is Apache RAT. Definitely use that.
> > >
> > > One of the hardest tasks is getting the contents of LICENSE and NOTICE
> > > right. That is a manual task I’m afraid.
> > >
> > > Julian
> > >
> > > > On Apr 18, 2018, at 08:34, Gian Merlino <gianmerl...@gmail.com>
> wrote:
> > > >
> > > > Hi Slim,
> > > >
> > > > > Do you know if ORC & Hive use this tool as part of their release
> > > > > process? And if it's considered a good tool by itself for verifying
> > > > > we meet all of the Apache licensing requirements, or if we'll need
> > > > > something else too?
> > > > >
> > > > >> On Tue, Apr 17, 2018 at 9:15 PM, Slim Bouguerra <bs...@apache.org> wrote:
> > > > >>
> > > > >> One of the questions at the last dev sync was about generating
> > > > >> dependency license reports.
> > > > >> Some projects (ORC and Hive) use the Maven site plugin, which can
> > > > >> generate reports with all the dependency and license details.
> > > > >> I ran it on Druid; this is how it looks for the Druid API module.
> > > > >> Command:
> > > > >>
> > > > >> mvn project-info-reports:dependencies
> > > > >>
> > > > >> The site directory can be found under target/site.
> > > > >> Here is an example for one module:
> > > > >> https://drive.google.com/file/d/1P8R0kZjp8zP4WSOVrKdlJF7Xr8-OI7Oe/view?usp=sharing
> > > > >>
> > > > >> Also, no fancy tools are used to detect unwanted licenses; that is
> > > > >> done while reviewing PRs.
> > > >>
> > > >>
> > > >>
> > > >> 
> -
> > > >> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > > >> For additional commands, e-mail: dev-h...@druid.apache.org
> > > >>
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > > For additional commands, e-mail: dev-h...@druid.apache.org
> > >
> > >
> >
> >
> > --
> >
> > B-Slim
> > ___/\/\/\___/\/\/\___/\/\/\___/\/\/\___/\/\/\___
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> For additional commands, e-mail: dev-h...@druid.apache.org
>
>


Re: Dependencies licenses Report

2018-04-24 Thread Gian Merlino
> Are you sure about this? I thought the donation paperwork was signed, right?

Not yet as far as I know; last I heard was a week or so ago.

On Tue, Apr 24, 2018 at 1:41 PM, Slim Bouguerra <slim.bougue...@gmail.com>
wrote:

>
> > On Apr 24, 2018, at 1:18 PM, Gian Merlino <g...@apache.org> wrote:
> >
> > Do you mean the license headers?
>
> Yes, I did a quick run and it complained first about our headers and also
> about some of the other files without any headers.
>
> > Those, I think, we shouldn't change until
> > the code is imported into Apache.
>
> Are you sure about this? I thought the donation paperwork was signed, right?
>
> >
> > If it's possible to use Rat to audit dependency licenses without looking
> > at the license headers of our own files, that would still be useful at
> > this point.
>
> Not sure but will look.
>
> >
> > On Tue, Apr 24, 2018 at 12:57 PM, Slim Bouguerra <bs...@apache.org>
> wrote:
> >
> >>
> >> I think the first step in using RAT is to reformat all the Druid code
> >> licenses.
> >> Any idea if this can be done now, or do we need some legal work done first?
> >>
> >> On 2018/04/20 17:52:46, Slim Bouguerra <slim.bougue...@gmail.com>
> wrote:
> >>> As suggested above, RAT is used as a first filter that does most of the
> >>> checking, but it is not sufficient on its own.
> >>> The mvn site plugin is used to collect the list of dependencies, but it
> >>> is not enough either.
> >>> They manually edit/create the LICENSE/NOTICE files. It is done by a
> >>> human to avoid any glitch that an automatic tool would introduce and to
> >>> ensure that someone has looked at it.
> >>> It seems time-consuming the first time, but after that it should be
> >>> incremental and thus not that hard.
> >>>
> >>>
> >>>
> >>> On Wed, Apr 18, 2018 at 8:45 AM, Julian Hyde <jhyde.apa...@gmail.com>
> >> wrote:
> >>>
> >>>> The main tool to use is Apache RAT. Definitely use that.
> >>>>
> >>>> One of the hardest tasks is getting the contents of LICENSE and NOTICE
> >>>> right. That is a manual task I’m afraid.
> >>>>
> >>>> Julian
> >>>>
> >>>>> On Apr 18, 2018, at 08:34, Gian Merlino <gianmerl...@gmail.com>
> >> wrote:
> >>>>>
> >>>>> Hi Slim,
> >>>>>
> >>>>> Do you know if ORC & Hive use this tool as part of their release
> >>>>> process? And if it's considered a good tool by itself for verifying we
> >>>>> meet all of the Apache licensing requirements, or if we'll need
> >>>>> something else too?
> >>>>>
> >>>>>> On Tue, Apr 17, 2018 at 9:15 PM, Slim Bouguerra <bs...@apache.org> wrote:
> >>>>>>
> >>>>>> One of the questions at the last dev sync was about generating
> >>>>>> dependency license reports.
> >>>>>> Some projects (ORC and Hive) use the Maven site plugin, which can
> >>>>>> generate reports with all the dependency and license details.
> >>>>>> I ran it on Druid; this is how it looks for the Druid API module.
> >>>>>> Command:
> >>>>>>
> >>>>>> mvn project-info-reports:dependencies
> >>>>>>
> >>>>>> The site directory can be found under target/site.
> >>>>>> Here is an example for one module:
> >>>>>> https://drive.google.com/file/d/1P8R0kZjp8zP4WSOVrKdlJF7Xr8-OI7Oe/view?usp=sharing
> >>>>>>
> >>>>>> Also, no fancy tools are used to detect unwanted licenses; that is
> >>>>>> done while reviewing PRs.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> 
> >> -
> >>>>>> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> >>>>>> For additional commands, e-mail: dev-h...@druid.apache.org
> >>>>>>
> >>>>
> >>>> -
> >>>> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> >>>> For additional commands, e-mail: dev-h...@druid.apache.org
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>>
> >>> B-Slim
> >>> ___/\/\/\___/\/\/\___/\/\/\___/\/\/\___/\/\/\___
> >>>
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> >> For additional commands, e-mail: dev-h...@druid.apache.org
> >>
> >>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> For additional commands, e-mail: dev-h...@druid.apache.org
>
>


Re: [druid-dev] Apache migration logistics

2018-04-23 Thread Gian Merlino
Hi Parag,

You should be able to do it by checking out the repo at
https://svn.apache.org/repos/asf/incubator/public and then adding yourself
to the file trunk/content/projects/druid.xml.
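
As a sketch of those steps on the command line: the helper names, the `SVN` override, and the local directory name `incubator-projects` are assumptions for illustration. The sketch checks out only the projects directory (the narrower URL mentioned earlier in this thread) rather than the whole repository.

```shell
# SVN may be overridden (for example SVN=echo) to print the commands instead
# of running them; by default the svn found on PATH is used.
SVN="${SVN:-svn}"

# Hypothetical helper: check out the incubator projects directory with the
# given Apache id into ./incubator-projects.
checkout_incubator_projects() {
  apache_id="$1"
  "$SVN" checkout --username "$apache_id" \
    https://svn.apache.org/repos/asf/incubator/public/trunk/content/projects/ \
    incubator-projects
}

# Hypothetical helper: after editing incubator-projects/druid.xml by hand,
# commit only that file.
commit_status_page() {
  apache_id="$1"
  "$SVN" commit -m "Add $apache_id to the Druid status page" \
    incubator-projects/druid.xml
}
```

For example, `checkout_incubator_projects pjain1`, then edit druid.xml, then `commit_status_page pjain1`.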

On Wed, Apr 18, 2018 at 11:25 AM, Parag Jain <pjai...@oath.com.invalid>
wrote:

> I got my Apache id recently; it's pjain1.
> Not sure how to add it to the page -
> http://incubator.apache.org/projects/druid.html. If anyone knows, can you
> please add it?
> Thanks, Parag
>
> On Monday, March 12, 2018, 4:49:41 PM CDT, Gian Merlino <g...@imply.io>
> wrote:
>
>  Committers: please,
> 1) If you don't have an apache id already, fill out an ICLA:
> https://www.apache.org/dev/new-committers-guide.html#guide-for-new-committers
> and then post here and hopefully someone can figure out how to get you an
> id?
> 2) When you have an id, post it here if it's not in
> http://incubator.apache.org/projects/druid.html so someone can figure out
> how to add you to that, and then also try to sign up to
> private-subscr...@druid.apache.org (+ dev-subscr...@druid.apache.org
> which you should be on already). If you can't, then also post here, and
> hopefully someone can figure _that_ out.
>
> Gian
> On Fri, Mar 9, 2018 at 11:28 AM, Xavier Léauté <xav...@confluent.io>
> wrote:
>
> This thread is already going to both lists, and it looks like responses
> automatically go to both. Would be good to check what happens if we
> subscribe dev@ to the google group. If responding from the apache list
> doesn't automatically add the google group as well, it will be hard to keep
> the group useful.
> Agree with Julian a cutoff is necessary anyway, since the google group
> inherently becomes less useful over time, as some information only ends up
> in the apache list.
> On Fri, Mar 9, 2018 at 11:14 AM Nishant Bangarwa <nishant.mon...@gmail.com>
> wrote:
>
> We can register dev@druid.apache.org and us...@druid.apache.org as a user
> in druid user groups so that going forward any mails that are sent to druid
> google groups are also received on the apache lists and are on the record.
> This would be to bridge the gap during the migration only.
> @Julian, I can go ahead and try setting this up, if this seems reasonable?
> On Sat, 10 Mar 2018 at 00:09 Julian Hyde <jhyde.apa...@gmail.com> wrote:
>
> I don’t know. I don’t think it’s easy.
>
>
> On Mar 9, 2018, at 7:31 AM, Roman Leventov <roman.leventov@metamarkets.
> com> wrote:
> Could archives of druid-dev and druid-users mailing lists be transferred
> to the new lists?
> On Thu, Mar 8, 2018 at 8:48 AM, Julian Hyde <jh...@apache.org> wrote:
>
> I’m working on it. It turns out that I don’t have sufficient karma to
> create a git repo, so I’ve put in a request on the incubator list.
>
>
> On Mar 6, 2018, at 10:12 AM, Xavier Léauté <xav...@confluent.io> wrote:
> Julian, it looks like you or one of the mentors has to request the source
> code repos. Could you request a gitbox enabled repo?
> Based on https://incubator.apache.org/guides/transitioning_asf.html,
> for the initial migration, we need to involve infra to import the initial
> git history and grant them admin rights to the github repo.
> Charles, it also sounds like we won't be able to do any code migration until
> legal signs off on the software grant, could you drive that?
> On Mon, Mar 5, 2018 at 12:52 PM Julian Hyde <jhyde.apa...@gmail.com>
> wrote:
>
> The dev, users and private mailing lists now exist. You can see the
> archives:
> * https://lists.apache.org/list.html?dev@druid.apache.org
> * https://lists.apache.org/list.html?users@druid.apache.org
> * https://lists.apache.org/list.html?private@druid.apache.org
>
> To see the last of these, you need to log in.
> There will also be archives at
> https://mail-archives.apache.org/mod_mbox/druid-dev/ etc. (you might need
> to wait a few minutes for the archiver to catch up).
> If you are an initial committer or mentor, you are a member of the Druid
> PPMC and you must be on both dev and private lists now. Send a message
> to dev-subscribe@druid.apache.org and private-subscribe@druid.apache.org.
> Everyone else is welcome to join dev and/or users.
> Julian
>
> On Mar 5, 2018, at 11:58 AM, Nishant Bangarwa <nishant.mon...@gmail.com>
> wrote:
> Apache Incubator Superset is another example which uses github issues -
> https://github.com/apache/incubator-superset/issues
> For Superset it works as all the github issue interactions are captured in
> ASF owned mailing list via Gitbox Integration. See -
> https://lists.apache.org/list.html?d...@superset.apache.org
>
> For Druid, If everyone agrees we can also choose to capture interactions
> on github issues at a

Re: Dev list migration

2018-04-24 Thread Gian Merlino
Today (well, yesterday) is the day we had decided on for migrating the dev
list. If there are no objections I'll update the community page, send out
another note to the old list, and put the old list into read only mode.

On Tue, Apr 17, 2018 at 8:28 PM, Himanshu <g.himan...@gmail.com> wrote:

> +1
>
> On Tue, Apr 17, 2018 at 3:20 PM, Nishant Bangarwa <
> nbanga...@hortonworks.com> wrote:
>
> > +1
> > --
> > Nishant Bangarwa
> > Hortonworks
> > (M): +91-9729200044
> >
> > On 4/17/18, 10:19 PM, "Jonathan Wei" <jon...@apache.org> wrote:
> >
> > >+1
> > >
> > >On 2018/04/17 18:09:37, David Lim <david...@apache.org> wrote:
> > >> +1
> > >>
> > >> On Tue, Apr 17, 2018 at 11:49 AM, Gian Merlino <g...@apache.org> wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > In the dev sync today there was some general agreement around
> > >> > migrating the dev list to Apache next week. Please +1 or -1 in this
> > >> > thread as desired.
> > >> >
> > >> > I think ideally we'd want an autoresponder but I don't see a way to
> > >> > set one. So the concrete plan I'd propose instead is to do the
> > >> > following next Monday, April 23,
> > >> >
> > >> > 1) Post a message to the list that we have migrated to
> > >> > dev@druid.apache.org
> > >> > 2) Sticky that post on
> > >> > https://groups.google.com/forum/#!forum/druid-development
> > >> > 3) Update the list link on the web site at http://druid.io/community/
> > >> > 4) Disable posting on druid-developm...@googlegroups.com
> > >> >
> > >> > This message has been cross posted to both lists, but please reply on
> > >> > the Apache list. If you haven't signed up for it yet, you can do that
> > >> > by emailing dev-subscr...@druid.apache.org.
> > >> >
> > >>
> > >
> > >-
> > >To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > >For additional commands, e-mail: dev-h...@druid.apache.org
> > >
> >
>


Re: [druid-dev] consider doing a 0.12.1 release

2018-04-17 Thread Gian Merlino
It looks like https://github.com/druid-io/druid/pull/5654 is the last to
backport.

Shall we do a release candidate after that?

This will probably end up being a non-Apache release when all is said and
done, since the timelines will probably not line up - we want to get the
bug fixes out quickly, but we are not yet ready to do Apache releases from
either an IP or technical standpoint.

On Mon, Apr 9, 2018 at 9:07 PM, Gian Merlino <gianmerl...@gmail.com> wrote:

> I backported 5504 and 5587. I skipped 5374 since it turns out it isn't
> relevant to 0.12.0 -- it fixes something introduced since then -- so I
> moved that back to 0.13.0.
>
> On Mon, Apr 9, 2018 at 1:56 PM, Gian Merlino <g...@apache.org> wrote:
>
>> I raised backport PRs for everything on my list except for these three,
>> since they had some (minor) conflicts and I was using a backport script
>> that bails out if it sees any conflicts:
>>
>> - https://github.com/druid-io/druid/pull/5504
>> - https://github.com/druid-io/druid/pull/5587
>> - https://github.com/druid-io/druid/pull/5374
>>
>> I will take a closer look at them later if nobody else gets to them.
>>
>> Nishant, I added https://github.com/druid-io/druid/pull/5596 to the
>> milestone.
>>
>> Prashant, I added https://github.com/druid-io/druid/pull/5611 to the
>> milestone.
>>
>> On Mon, Apr 9, 2018 at 1:10 PM, Prashant Deva <prashant.d...@gmail.com>
>> wrote:
>>
>>> also for the list:
>>>
>>> - compact task throws exception #5611
>>>
>>> i dont have permission to add to gian's github milestone
>>>
>>> Prashant
>>>
>>> On Mon, Apr 9, 2018 at 12:26 PM, Gian Merlino <gianmerl...@gmail.com>
>>> wrote:
>>>
>>> > Here we go: https://github.com/druid-io/druid/milestone/26. Please add
>>> > stuff if it makes sense (high importance / low risk bug fixes).
>>> >
>>> > On Mon, Apr 9, 2018 at 12:14 PM, Gian Merlino <gianmerl...@gmail.com>
>>> > wrote:
>>> >
>>> > > It looks like there's a lot of support for doing a 0.12.1… I'll make a
>>> > > milestone in GH.
>>> > >
>>> > > On Mon, Apr 9, 2018 at 12:12 PM, Slim Bouguerra <bs...@apache.org> wrote:
>>> > >
>>> > >> +1, especially since there are some instability bugs.
>>> > >>
>>> > >> On Apr 9, 2018, at 11:57 AM, Gian Merlino <g...@apache.org> wrote:
>>> > >>
>>> > >> My feeling is that #3 and #2 are borderline, but #1 definitely warrants a
>>> > >> new release. Personally I have seen it occur at least a half dozen times,
>>> > >> and I had been thinking about proposing a Druid 0.12.1 release, so I'm
>>> > >> glad you brought it up.
>>> > >>
>>> > >> If we do 0.12.1, it would be another non-ASF release (we haven't got the
>>> > >> ASF process set up yet, and are not likely to have it set up in time) so
>>> > >> we should notify the incubator folks about it.
>>> > >>
>>> > >> I would also consider including these fixes in 0.12.1:
>>> > >>
>>> > >> - DoublesSketchModule: Fix serde for DoublesSketchMergeAggregatorFactory.
>>> > >>   (#5587)
>>> > >> - ArrayAggregation: Use long to avoid overflow (#5544)
>>> > >> - Respect forceHashAggregation in queryContext (#5533)
>>> > >> - Fix indexTask to respect forceExtendableShardSpecs (#5509)
>>> > >> - Add overlord unsecured paths to coordinator when using combined service
>>> > >>   (#5579)
>>> > >> - Fix SQLMetadataSegmentManager to allow succesive start and stop (#5554)
>>> > >> - Fix supervisor tombstone auth handling (#5504)
>>> > >> - Authorize supervisor history instead of current active supervisors for
>>> > >>   supervisor history API (#5501)
>>> > >> - Fix round robining in router. (#5500)
>>> > >> - SegmentMetadataQuery: Fix default interval handling. (#5489)
>>> > >> - Log exceptions thrown before persist() for indexing tasks (#5374)
>>> > >> - More memory limiting for HttpPostEmitter (#5300)
>>> > >> - pass c

Re: Tranquility Future

2018-04-17 Thread Gian Merlino
Is anyone interested in doing the work to migrate it to Apache? If so, I
think we should definitely do that.

In general I have been more interested in contributing to the Kafka
indexing service lately since it has nicer properties (exactly once, no
extra processes, can read late data). But Tranquility is still valuable
from a user perspective when best-effort is ok and when you don't have a
Kafka server. So I hesitate to get rid of it completely. But of course - to
actually migrate it to Apache, someone will have to step up and do the work.

On Tue, Apr 17, 2018 at 5:56 PM, Slim Bouguerra  wrote:

> Hi Devs,
> I am not sure if I have missed the discussion, but I am wondering what the
> future of Tranquility is after moving Druid out of druid-io.
> Will Tranquility stay under https://github.com/druid-io/tranquility/ ?
> In the future, will this repo https://github.com/druid-io/tranquility/
> still be manageable by the Druid community?
> Any thoughts about the future of Tranquility?
>
> Thanks !
>


Re: Making sure Github / Gitbox is setup properly

2018-04-18 Thread Gian Merlino
Those are good tips… thanks!

Does it make sense to send every GitHub notification to g...@druid.apache.org
or is there a reason you suggested splitting out issues and PRs?

On Wed, Apr 18, 2018 at 8:44 AM, Maxime Beauchemin <
maximebeauche...@gmail.com> wrote:

> Hi dev@druid.apache.org,
>
> Quick note to suggest making sure that Apache Infra is setting things up
> the right way. Make sure they set up GitHub notifications to go to
> `iss...@druid.apache.org` and `p...@druid.apache.org` and don't spam dev@.
> It's a requirement to have GitHub notifications sent to some ASF mailing
> list in order to use them. I recommend GH issues over Jira personally.
>
> I'm sending this because they made that mistake for Superset and our `dev@`
> is spammed and unusable, now we have to ask people to take down their email
> filters...
>
> Also I see that we are using 3rd party services like Travis, Teamcity and
> Coveralls. We'll have to open tickets with Apache INFRA
> https://issues.apache.org/jira/browse/INFRA to make sure they set those up
> as none of the committers can have `Admin` access to the repo. Travis is
> ok, but some of the services require Org-level perms which Apache INFRA
> won't give. If I remember well Coveralls isn't supported, so we may have to
> move to Codecov or whatever else works and is supported.
>
> It's a bit painful at first but it's all well worth it in the end!
>
> Welcome to Apache! :)
>
> Max
>


Re: [druid-dev] Apache migration logistics

2018-03-27 Thread Gian Merlino
Hi, today I found myself wondering about migration of source repos.

Does anyone know if we've got our GitBox git repo set up? And who we need
to talk to about getting the druid-io repos transferred to the apache org
in github such that we can do source control in an Apache-certified-okay
way? It sounded like we are going to be able to keep using GitHub PRs and
issues. So I'm hoping we can do this process:
https://help.github.com/articles/about-repository-transfers/ which lets us
keep all the issues, watchers, & stars intact.

On Mon, Mar 12, 2018 at 3:25 PM, Xavier Léauté <xav...@confluent.io> wrote:

> FYI, to update your information on the status page you need to check out
> https://svn.apache.org/repos/asf/incubator/public/trunk/content/projects/
> with
> your Apache credentials and update the druid.xml file in that directory.
>
> On Mon, Mar 12, 2018 at 2:49 PM Gian Merlino <g...@imply.io> wrote:
>
> > Committers: please,
> >
> > 1) If you don't have an apache id already, fill out an ICLA:
> > https://www.apache.org/dev/new-committers-guide.html#guide-for-new-committers
> > and then post here and hopefully someone can figure out how to get you an id?
> >
> > 2) When you have an id, post it here if it's not in
> > http://incubator.apache.org/projects/druid.html so someone can figure out
> > how to add you to that, and then also try to sign up to
> > private-subscr...@druid.apache.org (+ dev-subscr...@druid.apache.org
> > which you should be on already). If you can't, then also post here, and
> > hopefully someone can figure _that_ out.
> >
> > Gian
> >
> > On Fri, Mar 9, 2018 at 11:28 AM, Xavier Léauté <xav...@confluent.io>
> > wrote:
> >
> >> This thread is already going to both lists, and it looks like responses
> >> automatically go to both. Would be good to check what happens if we
> >> subscribe dev@ to the google group. If responding from the apache list
> >> doesn't automatically add the google group as well, it will be hard to
> >> keep the group useful.
> >>
> >> Agree with Julian a cutoff is necessary anyway, since the google group
> >> inherently becomes less useful over time, as some information only ends
> >> up in the apache list.
> >>
> >> On Fri, Mar 9, 2018 at 11:14 AM Nishant Bangarwa <
> >> nishant.mon...@gmail.com> wrote:
> >>
> >>> We can register dev@druid.apache.org and us...@druid.apache.org as
> >>> users in the druid google groups so that going forward any mails that are
> >>> sent to the druid google groups are also received on the apache lists and
> >>> are on the record. This would be to bridge the gap during the migration
> >>> only.
> >>>
> >>> @Julian, I can go ahead and try setting this up, if this seems reasonable?
> >>>
> >>> On Sat, 10 Mar 2018 at 00:09 Julian Hyde <jhyde.apa...@gmail.com>
> wrote:
> >>>
> >>>> I don’t know. I don’t think it’s easy.
> >>>>
> >>>>
> >>>> On Mar 9, 2018, at 7:31 AM, Roman Leventov <
> >>>> roman.leven...@metamarkets.com> wrote:
> >>>>
> >>>> Could archives of druid-dev and druid-users mailing lists be
> >>>> transferred to the new lists?
> >>>>
> >>>> On Thu, Mar 8, 2018 at 8:48 AM, Julian Hyde <jh...@apache.org> wrote:
> >>>>
> >>>>> I’m working on it. It turns out that I don’t have sufficient karma to
> >>>>> create a git repo, so I’ve put in a request on the incubator list.
> >>>>>
> >>>>>
> >>>>> On Mar 6, 2018, at 10:12 AM, Xavier Léauté <xav...@confluent.io>
> >>>>> wrote:
> >>>>>
> >>>>> Julian, it looks like you or one of the mentors has to request the
> >>>>> source code repos. Could you request a gitbox enabled repo?
> >>>>>
> >>>>> Based on https://incubator.apache.org/guides/transitioning_asf.html,
> >>>>> for the initial migration, we need to involve infra to import the
> >>>>> initial git history and grant them admin rights to the github repo.
> >>>>>
> >>>>> Charles, it also sounds like we won't be able to do any code migration
> >>>>> until legal signs off on the software grant, could you drive that?
> >>>>>
> >>>>> On Mon, Mar 5, 2018 at 12:52 PM Julian Hyde <jhyde.apa...@gmail.com>
> >>>>> wro

Re: [druid-dev] any reason to still keep overlord as separate node?

2018-03-31 Thread Gian Merlino
Hi Prashant,

The only issue that I can think of is that in some (super large) clusters,
the coordinator and overlord can both be pretty demanding in terms of
memory and it helps for scalability to have them be separate. But this is
not the common case - most clusters are smaller or medium sized. So it
makes sense for the default to be combining them. I would support a patch
that changed the defaults and updated the docs accordingly.

Btw, since we are trying to migrate the dev mailing list to Apache, please
cross post this sort of thing with dev@druid.apache.org, or even only post
to that list.

Gian

On Sat, Mar 31, 2018 at 9:42 AM, Prashant Deva 
wrote:

> I feel at least the documentation should be written to assume that
> overlord+coordinator is the default config and separate overlord is the
> 'legacy' one.
>
> Are there any actual issues that require keeping the overlord as a separate
> node?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Druid Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to druid-development+unsubscr...@googlegroups.com.
> To post to this group, send email to druid-developm...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/druid-development/2d22212b-23dc-4654-9b48-df8439cb62ad%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>


Re: [druid-dev] Questions about Public API

2018-03-31 Thread Gian Merlino
I cross-posted my reply, so this thread is on both lists now.

On Fri, Mar 30, 2018 at 3:32 PM, Jihoon Son <ghoon...@gmail.com> wrote:

> Thanks.
>
> Should I send this email to the Apache mailing list as well?
>
> Jihoon
>
> On Fri, Mar 30, 2018 at 1:58 PM, Gian Merlino <g...@imply.io> wrote:
>
> > Hi Jihoon,
> >
> > The javadoc for @PublicApi explains the intent. They can change, but not
> > in a minor release. There should also
> > be some consensus around changing them. Once we are 1.0, then we should
> > only change PublicApis when going
> > to 2.0.
> >
> > Btw, since we are trying to migrate the dev mailing list to Apache, please
> > cross post this sort of thing with d...@druid.incubator.apache.org, or even
> > only post to that list.
> >
> > From the javadoc:
> >
> > > Signifies that the annotated entity is a public API for extension authors.
> > > Public APIs may change in breaking ways only between major Druid release
> > > lines (e.g. 0.10.x -> 0.11.0), but otherwise must remain stable. Public
> > > APIs may change at any time in non-breaking ways, however, such as by
> > > adding new fields, methods, or constructors.
> > >
> > > Note that interfaces annotated with {@code PublicApi} but not with
> > > {@link ExtensionPoint} are not meant to be subclassed in extensions. In
> > > this case, the annotation simply signifies that the interface is stable
> > > for callers. In particular, since it is not meant to be subclassed, new
> > > non-default methods may be added to an interface and new abstract methods
> > > may be added to a class.
> > >
> > > If a class or interface is annotated, then all public and protected
> > > fields, methods, and constructors that class or interface are considered
> > > stable in this sense. If a class is not annotated, but an individual
> > > field, method, or constructor is annotated, then only that particular
> > > field, method, or constructor is considered a public API.
> > >
> > > Classes, fields, method, and constructors _not_ annotated with
> > > {@code @PublicApi} may be modified or removed in any Druid release, unless
> > > they are annotated with {@link ExtensionPoint} (which implies they are a
> > > public API as well).
> >
> > Gian
> >
> > On Thu, Mar 29, 2018 at 5:45 PM, Jihoon Son <jihoon...@apache.org>
> wrote:
> >
> >> Hi folks,
> >>
> >> I wonder what's the exact meaning of the 'PublicApi' annotation. From
> >> https://github.com/druid-io/druid/pull/4433,
> >>
> >> > @PublicApi which signifies something you're not meant to subclass, but
> >> > that you can use for implementation.
> >>
> >> I can also see some methods can't be deleted because they are in some
> >> classes annotated with @PublicApi. Here is an example in TaskRunner.
> >>
> >> /**
> >>  * Start the state of the runner.
> >>  *
> >>  * This method is unused, but TaskRunner is {@link PublicApi}, so we
> >>  * cannot remove it.
> >>  */
> >> @SuppressWarnings("unused")
> >> void start();
> >>
> >> Does this mean @PublicApi classes must change in a backward-compatible
> >> way? Or can we change in a non-compatible way and call out when we
> >> release?
> >>
> >> If this is not defined yet, it would be good to start a discussion on
> >> this.
> >>
> >> Best,
> >> Jihoon
> >>
> >
>


Re: [druid-dev] Questions about Public API

2018-04-01 Thread Gian Merlino
Hi Jihoon,

The javadoc for @PublicApi explains the intent. They can change, but not in
a minor release. There should also
be some consensus around changing them. Once we are 1.0, then we should
only change PublicApis when going
to 2.0.

Btw, since we are trying to migrate the dev mailing list to Apache, please
cross post this sort of thing with
d...@druid.incubator.apache.org, or even only post to that list.

From the javadoc:

> Signifies that the annotated entity is a public API for extension authors.
> Public APIs may change in breaking ways only between major Druid release
> lines (e.g. 0.10.x -> 0.11.0), but otherwise must remain stable. Public APIs
> may change at any time in non-breaking ways, however, such as by adding new
> fields, methods, or constructors.
>
> Note that interfaces annotated with {@code PublicApi} but not with
> {@link ExtensionPoint} are not meant to be subclassed in extensions. In this
> case, the annotation simply signifies that the interface is stable for
> callers. In particular, since it is not meant to be subclassed, new
> non-default methods may be added to an interface and new abstract methods may
> be added to a class.
>
> If a class or interface is annotated, then all public and protected fields,
> methods, and constructors that class or interface are considered stable in
> this sense. If a class is not annotated, but an individual field, method, or
> constructor is annotated, then only that particular field, method, or
> constructor is considered a public API.
>
> Classes, fields, method, and constructors _not_ annotated with
> {@code @PublicApi} may be modified or removed in any Druid release, unless
> they are annotated with {@link ExtensionPoint} (which implies they are a
> public API as well).
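
To make the class-level versus member-level rule in that javadoc concrete,
here is a minimal sketch. Everything in it is an assumption for illustration:
the annotations are local stand-ins declared with runtime retention only so
they can be inspected by reflection (the real Druid annotations may be
declared differently), and Runner, Helper, and isStable are hypothetical
names, not Druid classes or APIs.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class PublicApiSketch {
  // Local stand-ins for the annotations discussed above (assumption: runtime
  // retention is used here purely so the sketch can read them reflectively).
  @Retention(RetentionPolicy.RUNTIME)
  @Target({ElementType.TYPE, ElementType.FIELD, ElementType.METHOD, ElementType.CONSTRUCTOR})
  public @interface PublicApi {}

  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface ExtensionPoint {}

  // Whole type annotated: every public member counts as stable for callers.
  @PublicApi
  public interface Runner {
    void start();

    // A later, non-breaking addition: existing callers keep compiling.
    default String describe() {
      return "runner";
    }
  }

  // Type not annotated: only the individually annotated member is stable.
  public static class Helper {
    @PublicApi
    public int stableMethod() {
      return 1;
    }

    public int internalMethod() {
      return 2;
    }
  }

  // Returns true if the named no-arg public method is covered by the rule:
  // its declaring type is annotated, or the method itself is annotated.
  public static boolean isStable(Class<?> type, String methodName) {
    try {
      Method m = type.getMethod(methodName);
      Class<?> owner = m.getDeclaringClass();
      return owner.isAnnotationPresent(PublicApi.class)
          || owner.isAnnotationPresent(ExtensionPoint.class)
          || m.isAnnotationPresent(PublicApi.class);
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(isStable(Runner.class, "start"));          // true
    System.out.println(isStable(Runner.class, "describe"));       // true
    System.out.println(isStable(Helper.class, "stableMethod"));   // true
    System.out.println(isStable(Helper.class, "internalMethod")); // false
  }
}
```

The reflective check is only a way to spell the rule out; the annotation
itself is a documentation contract and nothing in this sketch enforces it.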

Gian

On Thu, Mar 29, 2018 at 5:45 PM, Jihoon Son  wrote:

> Hi folks,
>
> I wonder what's the exact meaning of the 'PublicApi' annotation. From
> https://github.com/druid-io/druid/pull/4433,
>
> > @PublicApi which signifies something you're not meant to subclass, but
> > that you can use for implementation.
>
> I can also see some methods can't be deleted because they are in some
> classes annotated with @PublicApi. Here is an example in TaskRunner.
>
> /**
>  * Start the state of the runner.
>  *
>  * This method is unused, but TaskRunner is {@link PublicApi}, so we
>  * cannot remove it.
>  */
> @SuppressWarnings("unused")
> void start();
>
> Does this mean @PublicApi classes must change in a backward-compatible
> way? Or can we change in a non-compatible way and call out when we release?
>
> If this is not defined yet, it would be good to start a discussion on this.
>
> Best,
> Jihoon
>
>


Podling report - April 2018

2018-04-03 Thread Gian Merlino
Hi Druids,

Our podling report is due tomorrow and I am starting work on it.

I will post a draft here before editing the incubator wiki.


Re: [druid-dev] Skipping Empty Buckets feature

2018-04-03 Thread Gian Merlino
Hi Suhas,

How would you expect the results to come back if groupBy did _not_ skip
empty time buckets, and one day was empty? Should all the dimensions be
null?
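
One way to picture the question is the minimal sketch below. It is an
illustration only, not Druid's result format or query engine: Row and
fillEmptyDays are hypothetical names, and the choice of a null dimension with
a zero count for an empty day is just one possible answer. It also assumes the
input rows are sorted by day and fall inside the requested interval.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class EmptyBucketSketch {
  // One grouped result row: a day bucket, one dimension value, and a count.
  public static final class Row {
    public final LocalDate day;
    public final String country; // null stands for "no dimension value"
    public final long count;

    public Row(LocalDate day, String country, long count) {
      this.day = day;
      this.country = country;
      this.count = count;
    }
  }

  // Copies the rows through and adds a placeholder row (null dimension,
  // zero count) for each day in [start, end) that has no rows at all.
  public static List<Row> fillEmptyDays(List<Row> rows, LocalDate start, LocalDate end) {
    List<Row> out = new ArrayList<>();
    int i = 0;
    for (LocalDate day = start; day.isBefore(end); day = day.plusDays(1)) {
      boolean sawRow = false;
      while (i < rows.size() && rows.get(i).day.equals(day)) {
        out.add(rows.get(i++));
        sawRow = true;
      }
      if (!sawRow) {
        // There is no dimension value to report for an empty bucket.
        out.add(new Row(day, null, 0L));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Row> rows = new ArrayList<>();
    rows.add(new Row(LocalDate.of(2018, 4, 1), "US", 10L));
    rows.add(new Row(LocalDate.of(2018, 4, 1), "FR", 3L));
    rows.add(new Row(LocalDate.of(2018, 4, 3), "US", 7L));
    for (Row r : fillEmptyDays(rows, LocalDate.of(2018, 4, 1), LocalDate.of(2018, 4, 4))) {
      System.out.println(r.day + " " + r.country + " " + r.count);
    }
  }
}
```

In this sketch April 2 comes back as a single row with a null country and a
count of zero, which is the shape the question above is asking about.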

Btw, since we are migrating the dev mailing list to Apache, please cross
post this sort of thing with dev@druid.apache.org, or even only post to
that list.

Gian

On Tue, Apr 3, 2018 at 4:36 AM, Suhas  wrote:

> This is straight from the docs:
>
>> Note that all the empty buckets are discarded.
>
> This could actually be a good thing. I think there should be an option in
> the *context* for whether or not to skip empty buckets, even in groupBy
> queries. Personally, I have found myself in situations where queries that
> group by some dimensions at day granularity should have included all
> buckets in the result. I don't see why this is a bad idea?
>
>


Re: Podling report - April 2018

2018-04-03 Thread Gian Merlino
Hi Druids,

Here is a draft of our podling report for April 2018. Let me know what you
think. It's due tomorrow EOD so I will post it tomorrow.



Three most important issues to address in the move towards graduation:

 1. Complete SGA for current sources and ICLAs for current committers.
 2. Move the source code and website to Apache infrastructure.
 3. Plan and execute our first Apache release.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

- None.

How has the community developed since the last report?

- We have moved development discussions to our Apache dev mailing list.
- A healthy, constant flow of bug fixes, quality improvements and new features
  is still ongoing on https://github.com/druid-io/druid.

How has the project developed since the last report?

- Since the last report there have been 36 commits from 18 individuals.
- We have released Druid 0.12.0 (outside the Incubator). We are optimistic
  that our next release will be done as an Apache release.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [X] Initial setup
  [ ] Working towards first release
  [ ] Community building
  [ ] Nearing graduation
  [ ] Other:

Date of last release:

- Druid 0.12.0 on 2018-03-06 (non-Apache release)
- No official release yet since beginning Apache Incubation

When were the last committers or PPMC members elected?

- Project is still functioning with the initial set of committers.

On Tue, Apr 3, 2018 at 11:43 AM, Gian Merlino <g...@apache.org> wrote:

> Hi Druids,
>
> Our podling report is due tomorrow and I am starting work on it.
>
> I will post a draft here before editing the incubator wiki.
>


Re: Considering 0.12.2 release

2018-06-21 Thread Gian Merlino
I think we're hoping that the next major (0.13) will also be our first
Apache release. If it ends up taking too long we might want to rethink that
but hopefully it won't (see the recently started thread about the next
step, migrating the Github repos).

On Wed, Jun 20, 2018 at 10:42 PM Roman Leventov 
wrote:

> Why not release 0.13.0 instead (as non-Apache)?
>
> On Thu, 21 Jun 2018, 08:18 Gian Merlino,  wrote:
>
> > +1
> >
> > I would suggest adding these two as well.
> >
> > https://github.com/druid-io/druid/pull/5878 - Fix inefficient available
> > segment cache population in SQLMetadataSegmentManager.
> > https://github.com/druid-io/druid/pull/5873 - HdfsDataSegmentPusher: Close
> > tmpIndexFile before copying it.
> >
> > On Tue, Jun 12, 2018 at 11:34 AM Prashant Deva 
> > wrote:
> >
> > > +1
> > >
> > > On Fri, Jun 8, 2018 at 3:37 PM Jihoon Son 
> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > we have a couple of bug fix PRs available and some of them fix
> > > > regression bugs.
> > > >
> > > > Here is the list of currently available bug fix PRs which are not
> > > > included in 0.12.1.
> > > >
> > > > Regression bug fixes
> > > > - https://github.com/druid-io/druid/pull/5858
> > > > - https://github.com/druid-io/druid/pull/5805
> > > > - https://github.com/druid-io/druid/pull/5807
> > > >
> > > > Non-regression bug fixes
> > > > - https://github.com/druid-io/druid/pull/5850
> > > > - https://github.com/druid-io/druid/pull/5815
> > > > - https://github.com/druid-io/druid/pull/5856
> > > >
> > > > I think it's worth making another release for users.
> > > >
> > > > Any ideas are welcome.
> > > >
> > > > Jihoon
> > > >
> > > --
> > > Prashant
> > >
> >
>


Re: Netty 4.1.x

2018-10-05 Thread Gian Merlino
It sounds good to me.

BTW, we still use netty 3.x for http-client and so it's pretty pervasive.
It coexists with netty 4.x (the packages are different) so there isn't a
conflict at that level. But if we wanted to, like, _fully_ upgrade to netty
4.1.x then it'd involve porting over the http-client.

On Fri, Oct 5, 2018 at 11:20 AM Charles Allen
 wrote:

> https://github.com/apache/incubator-druid/pull/6417 proposes upgrading to
> netty 4.1.x
>
> A lot of the prior issues are likely resolved. Things like java-util are
> part of the druid repository now, the dependent libraries which were still
> using 4.0.x are upgraded (in the PR) to ones using 4.1.x, and Spark's
> latest major version (2.3.x) has netty 4.1
>
> I propose giving netty 4.1.x another shot.
>
> Sound good?
>
> Charles Allen
>


Re: ICLA

2018-10-16 Thread Gian Merlino
That is the understanding we've been applying to Druid itself. My
understanding of ASF policy is that committers need ICLAs, and other
contributors only need "clear intent to contribute", which is established
if the PR author == the code author.

On Tue, Oct 16, 2018 at 12:18 PM Maxime Beauchemin <
maximebeauche...@gmail.com> wrote:

> Hey,
>
> While I'm maintaining PyDruid, I'm wondering whether I should still be
> asking all contributors for an ICLA.  From my understanding, the ASF
> requires an ICLA only for committers, not all contributors (is that
> right?).
>
> Max
>


Re: [VOTE] Tranquility 0.8.3 release

2018-10-16 Thread Gian Merlino
Tranquility isn't an Apache project (yet?). It is one of Druid's companion
projects, like pydruid and RDruid, that live in separate git repos with an
independent release process. What is being voted on is the latest commit in
github. Unlike Druid we have typically not done release candidates or very
formal release processes in general for the companion projects. They have a
smaller feel to them and some of them have just a single maintainer, or
even no active maintainer.

It might make sense to migrate some or all of them to Apache at some point.
There hasn't been much discussion about it so I am not sure if there is
really consensus on what to do about them. For now I guess we are
continuing with the 'classic' process for them.

On Mon, Oct 15, 2018 at 8:21 PM Julian Hyde  wrote:

> Can someone please clarify what is going on here. Am I correct that
> Tranquility is not an Apache project? Who is allowed to vote for this
> release - Druid PPMC members?
>
> What is being voted upon? A particular set of artifacts to be released,
> the latest commit in github, or something else? (If it’s not an Apache
> release, I guess I shouldn’t complain that the vote doesn’t follow Apache
> protocol.)
>
> Julian
>
>
> > On Oct 15, 2018, at 7:51 PM, David Lim  wrote:
> >
> > +1
> >
> > On Mon, Oct 15, 2018 at 6:08 PM Fangjin Yang  wrote:
> >
> >> +1
> >>
> >> On Mon, Oct 15, 2018 at 4:40 PM Jihoon Son 
> wrote:
> >>
> >>> +1
> >>>
> >>> Thanks Jon!
> >>>
> >>> Jihoon
> >>>
> >>> On Tue, Oct 16, 2018 at 5:22 AM Jonathan Wei 
> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> I'd like to open a vote for a new Tranquility release, 0.8.3. The new
> >>>> release would have the following improvements and bug fixes:
> >>>>
> >>>> Improvements:
> >>>> * Update Curator and Scala. (#213)
> >>>> * support rollup function in druid 0.9.2 (#210)
> >>>> * Allow customization of zookeeper path through properties. (#215)
> >>>> * Update MMX libraries and replace scala_tools.time (#220)
> >>>> * Exclude deps with *GPL licenses. (#223)
> >>>> * expose sslContext and prefer tlsPort if present (#257)
> >>>> * Support Basic HTTP auth with druid, TLS support for server (#277)
> >>>>
> >>>> Bug fixes:
> >>>> * remove data type and input row parser type binding (#193)
> >>>> * Change default host/port for DruidNode and FlinkBeam (#266)
> >>>> * Thread-safe samza BeamProducer (#228)
> >>>>
> >>>> Notably, this release would allow Tranquility to work with TLS-secured
> >>>> Druid clusters and support Basic HTTP user/pass authentication.
> >>>>
> >>>> Thanks,
> >>>> Jon
> >>>>
> >>>
> >>
>
>


Re: Metrics updates in Release Notes

2018-10-31 Thread Gian Merlino
Why not also tag those with "Release Notes"? It makes it a lot easier for
release managers to do their jobs if they just have to look at one label.
(Or two, I guess: "release notes" and "incompatible". But I would be down
to merge them.)

On Wed, Oct 31, 2018 at 9:26 AM David Lim  wrote:

> Thanks Roman. I'm helping with the release this time so I will check the
> PRs with that label and include them in the release notes as appropriate.
>
> As far as I know, there isn't any document like that, but I agree it would
> be quite useful.
>
> On Wed, Oct 31, 2018 at 9:04 AM Roman Leventov 
> wrote:
>
> > It's suggested that the person that prepares Druid Release Notes (I think
> > it's Jon usually) goes through all PRs labelled "Area - Metrics/Event
> > Emitting" (
> > https://github.com/apache/incubator-druid/pulls?q=is%3Apr+sort%3Aupdated-desc+label%3A%22Area+-+Metrics%2FEvent+Emitting%22+is%3Aclosed+milestone%3A0.13.0
> > )
> > along with "Release Notes", to present this information in the release
> > notes.
> >
> > BTW I wonder, is there a document in the repository or elsewhere that
> > describes the release process?
> >
>


Re: Metrics updates in Release Notes

2018-10-31 Thread Gian Merlino
I don't think we have a doc about how to do a release, but yeah it would be
great to have it. Dave, would you be able to put it together while you
manage this release? I am sure it will differ substantially from what we've
done in the past, because of the new Apache-ified stuff.

On Wed, Oct 31, 2018 at 10:07 AM Gian Merlino  wrote:

> Why not also tag those with "Release Notes"? It makes it a lot easier for
> release managers to do their jobs if they just have to look at one label.
> (Or two, I guess: "release notes" and "incompatible". But I would be down
> to merge them.)
>
> On Wed, Oct 31, 2018 at 9:26 AM David Lim  wrote:
>
>> Thanks Roman. I'm helping with the release this time so I will check the
>> PRs with that label and include them in the release notes as appropriate.
>>
>> As far as I know, there isn't any document like that, but I agree it would
>> be quite useful.
>>
>> On Wed, Oct 31, 2018 at 9:04 AM Roman Leventov 
>> wrote:
>>
>> > It's suggested that the person that prepares Druid Release Notes (I think
>> > it's Jon usually) goes through all PRs labelled "Area - Metrics/Event
>> > Emitting" (
>> >
>> >
>> https://github.com/apache/incubator-druid/pulls?q=is%3Apr+sort%3Aupdated-desc+label%3A%22Area+-+Metrics%2FEvent+Emitting%22+is%3Aclosed+milestone%3A0.13.0
>> > )
>> > along with "Release Notes", to present this information in the release
>> > notes.
>> >
>> > BTW I wonder is there a document in the repository or elsewhere that
>> > describes the release process?
>> >
>>
>


Re: [VOTE] Release Apache Druid (incubating) 0.13.0 [RC1]

2018-10-30 Thread Gian Merlino
-1 because there are issues tagged for 0.13.0 that are not part of this
release:

- https://github.com/apache/incubator-druid/pull/6512
- https://github.com/apache/incubator-druid/pull/6514
- https://github.com/apache/incubator-druid/pull/6516
- https://github.com/apache/incubator-druid/pull/6508
- https://github.com/apache/incubator-druid/pull/6520
- https://github.com/apache/incubator-druid/issues/6546

I also noticed a doc problem that I haven't filed an issue for yet, which
is that the tutorials download section isn't accurate any longer. It says
to use http://static.druid.io/, but we probably won't put the artifacts
there. And it says to do "cd druid-#{DRUIDVERSION}" but now the directory
name is "apache-druid" not "druid".

I tried to go through an Apache style verification process anyway. Here's
what I looked at.

Source release:
- GPG signature and SHA512 are ok
- Tarball name and structure look ok
- LICENSE, NOTICE, and DISCLAIMER are present
- Code builds and tests pass (by running "mvn package")
- Cloned a fresh Druid repo, checked out druid-0.13.0-incubating-rc1
(acf15b42778d3a84638193a3b07c6814cf2f35a2), and compared it to the source
release. The source release has two extra files: git.version (expected)
and extensions-core/protobuf-extensions/dependency-reduced-pom.xml
(unexpected). Perhaps the latter should be removed. The source release is
also _missing_ a few files, which I am ok with, since they don't seem to be
necessary in a source release (.gitignore, .idea, .travis.yml,
eclipse.importorder, eclipse_formatting.xml, publications, upload.sh).
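
A minimal sketch of the kind of tree comparison described above, assuming the
source tarball has been unpacked into one directory and the tagged checkout
sits in another. The function names and directory paths are hypothetical and
are not part of any Druid or Apache tooling; it only compares file names, not
file contents.

```python
import os


def list_files(root):
    """Return the set of file paths under root, relative to root, using '/'."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git metadata so the checkout's .git directory is not reported.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.add(rel.replace(os.sep, "/"))
    return found


def compare_trees(release_dir, checkout_dir):
    """Return (extra_in_release, missing_from_release) as sorted lists."""
    release = list_files(release_dir)
    checkout = list_files(checkout_dir)
    return sorted(release - checkout), sorted(checkout - release)
```

Calling compare_trees("unpacked-src", "fresh-clone") (both paths are
placeholders) would return the files that exist only in the release and the
files that exist only in the checkout, which is the pair of lists reported in
the message above.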

Binary "release" (not really a release, from what I understand, but still
important):
- GPG signature and SHA512 are ok
- Tarball name and structure look ok. It expands to
"apache-druid-0.13.0-incubating" not "apache-druid-0.13.0-incubating-bin"
but IMO that is fine.
- Did the "tutorial-batch" quickstart and verified data could be loaded and
queried.
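
For the SHA512 part of the checks above, a rough sketch of one way to do it
with only the Python standard library. It assumes the published checksum file
starts with the hex digest as its first whitespace-separated token; that
layout is an assumption, not something stated in this thread, and the file
names are placeholders. The GPG signature check is a separate step and is not
covered here.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact_path, checksum_path):
    """Compare an artifact's digest with the one stored in a checksum file."""
    with open(checksum_path, "r", encoding="utf-8") as f:
        tokens = f.read().split()
    # Assumption: the first token in the checksum file is the hex digest.
    expected = tokens[0].lower() if tokens else ""
    return expected == sha512_of(artifact_path)
```

checksum_matches("release-src.tar.gz", "release-src.tar.gz.sha512") (both
names hypothetical) returns True only when the two digests agree.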

On Tue, Oct 23, 2018 at 11:56 AM Julian Hyde  wrote:

> You’re right. After I imported the keys, using
>
> $ gpg --recv-keys 58B5D669D2FFD83B37D88DF8BB64B3727183DE56
>
> it worked. I could have done ‘gpg --import KEYS’ also.
>
> Changing my vote to +0 (binding). I would give +1 but the artifacts I am
> asked to review contain a bin.tar.gz and I have no idea how to review
> binary artifacts. Maybe some other reviewers can chime in.
>
> I do know how to find out online how to build Druid. I think it is
> important that src.tar.gz is self-contained, and that includes the
> necessary build instructions. (Just as I really appreciate when a box of
> pasta has “boil for 12 minutes” written on the side.)
>
> I don’t know whether Apache has guidelines for PaxHeaders. I’m just saying
> it looked weird to me. And by the way, emacs tar-mode choked on the tar
> file. Not a blocker, just friction.
>
> There is a way to add headers to .md files. And I claim there is just as
> much creativity in these as in source code. Please fix in the next release.
>
> Julian
>
>
> > On Oct 22, 2018, at 10:04 PM, David Lim  wrote:
> >
> > Hi Julian,
> >
> > I believe the PaxHeader files are the result of extracting a tarball built
> > with POSIX tar with a GNU or other variant. I believe it is a result of
> > this configuration:
> >
> https://github.com/apache/incubator-druid/blob/master/distribution/pom.xml#L195
> >
> > Does Apache have guidelines on what variant of tar should be used in
> > generating release artifacts?
> >
> > Also, just thought I would note that we do have published documentation on
> > building from source here:
> >
> https://github.com/apache/incubator-druid/blob/master/docs/content/development/build.md
> > - but there are a few statements that should be updated, hence my comment
> > that I'll make sure the docs get updated.
> >
> > Regards,
> > David
> >
> >
> > On Mon, Oct 22, 2018 at 10:20 PM David Lim  wrote:
> >
> >> Hi Julian,
> >>
> >> Thank you for the thorough review!
> >>
> >> For the GPG key, my understanding is that it was expected that users would
> >> fetch the key by either running 'gpg --import KEYS' on the KEYS file (as
> >> per https://www.apache.org/dev/release-signing.html#keys-policy) or by
> >> importing it from the Apache phonebook (
> >> https://people.apache.org/keys/committer/davidlim.asc) or grabbing it
> >> from a well-known key server (e.g.
> >> http://pgp.mit.edu/pks/lookup?search=davidlim%40apache.org=index). Did
> >> this not work for you?
> >>
> >>> src.tar.gz file contains files such as
> >>
> ./PaxHeaders.X/apache-druid-0.13.0-incubating-src_indexing-service_src_test_java_org_apache_druid_s
> >> I will check to see whether these files are expected or not.
> >>
> >> The files you identified from the diff against the git tag are all either
> >> expected to be omitted or were generated as part of the source packaging.
> >>
> >> For instructions on building the release from source, I mentioned the
> >> command in the original post but it may have been 

Re: Please add me to dev subscriber list

2018-09-29 Thread Gian Merlino
Hey Panner,

You can subscribe to the list by emailing "dev-subscr...@druid.apache.org".

On Sat, Sep 29, 2018 at 1:15 AM Panner selvam Velmyl <
pannerselvam.vel...@gmail.com> wrote:

>
>


Re: Druid Developer Contribution

2018-10-03 Thread Gian Merlino
Hi Ravi,

All of us use either MacOS or Linux for development, so that issue might be
Windows related (it does use different end-of-line markers from what
MacOS/Linux use).
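
One way to test the end-of-line theory is to look at the last bytes of the
files named in the errors. The sketch below is only an illustration: the
function name is hypothetical, and the idea that the check may expect the
platform's own line separator is an assumption, not something confirmed in
this thread.

```python
def line_ending_at_eof(path):
    """Report how a file ends: 'crlf', 'lf', 'cr', or 'none'."""
    with open(path, "rb") as f:
        data = f.read()
    # Check CRLF first, since a CRLF ending also ends with LF.
    if data.endswith(b"\r\n"):
        return "crlf"
    if data.endswith(b"\n"):
        return "lf"
    if data.endswith(b"\r"):
        return "cr"
    return "none"
```

If a file reports "lf" on a Windows checkout while the build still complains,
that would point toward a line-separator mismatch rather than a truly missing
final newline.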

On Wed, Oct 3, 2018 at 4:19 AM Ravi Kumar Gadagotti <
ravikumargadago...@gmail.com> wrote:

> Hi,
>
> I am getting the following error message when I am compiling using the
> maven or eclipse, I am not sure where to modify the POM to get it to work.
>
> And one more question is, can I edit the code in Windows or should I do it
> on Linux/Unix or macOS?
>
> [ERROR]
>
> G:\incubator-druid-master\incubator-druid-master\java-util\src\main\java\org\apache\druid\common\config\NullHandling.java:0:
> File does not end with a newline. [NewlineAtEndOfFile]
> [ERROR]
>
> G:\incubator-druid-master\incubator-druid-master\java-util\src\main\java\org\apache\druid\common\config\NullValueHandlingConfig.java:0:
> File does not end with a newline. [NewlineAtEndOfFile]
> [ERROR]
>
> G:\incubator-druid-master\incubator-druid-master\java-util\src\main\java\org\apache\druid\concurrent\ConcurrentAwaitableCounter.java:0:
> File does not end with a newline. [NewlineAtEndOfFile]
> [ERROR]
>
> G:\incubator-druid-master\incubator-druid-master\java-util\src\main\java\org\apache\druid\guice\annotations\ExtensionPoint.java:0:
> File does not end with a newline. [NewlineAtEndOfFile]
>
> On Wed, Oct 3, 2018 at 7:08 AM Ravi Kumar Gadagotti <
> ravikumargadago...@gmail.com> wrote:
>
> > Hi,
> >
> > I am getting the following error message when I am compiling using the
> > maven or eclipse, I am not sure where to modify the POM to get it to work.
> >
> > [ERROR]
> >
> G:\incubator-druid-master\incubator-druid-master\java-util\src\main\java\org\apache\druid\common\config\NullHandling.java:0:
> > File does not end with a newline. [NewlineAtEndOfFile]
> > [ERROR]
> >
> G:\incubator-druid-master\incubator-druid-master\java-util\src\main\java\org\apache\druid\common\config\NullValueHandlingConfig.java:0:
> > File does not end with a newline. [NewlineAtEndOfFile]
> > [ERROR]
> >
> G:\incubator-druid-master\incubator-druid-master\java-util\src\main\java\org\apache\druid\concurrent\ConcurrentAwaitableCounter.java:0:
> > File does not end with a newline. [NewlineAtEndOfFile]
> > [ERROR]
> >
> G:\incubator-druid-master\incubator-druid-master\java-util\src\main\java\org\apache\druid\guice\annotations\ExtensionPoint.java:0:
> > File does not end with a newline. [NewlineAtEndOfFile]
> >
> > On Wed, Oct 3, 2018 at 1:48 AM Surekha Saharan wrote:
> >
> >> Hi Ravi,
> >>
> >> After you git cloned the project, are you able to build it using
> >> mvn clean install -DskipTests before importing into eclipse? Also make
> >> sure your eclipse java compiler and runtime are set to version 1.8
> >>
> >> In case you are open to using IntelliJ, there are some guidelines here:
> >> https://github.com/apache/incubator-druid/blob/master/INTELLIJ_SETUP.md
> >>
> >> Good luck,
> >> Surekha
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Oct 2, 2018 at 4:49 PM Ravi Kumar Gadagotti <
> >> ravikumargadago...@gmail.com> wrote:
> >>
> >> > Hi Surekha,
> >> >
> >> > Hope you are free to answer my silly questions...
> >> >
> >> > I just downloaded the source code and imported it to eclipse and when I am
> >> > updating the project, I am getting so many errors that I cannot even
> >> > compile or update the project in eclipse using maven. If possible can I get
> >> > any help from any of the existing developers for initial setup?
> >> > Thanks,
> >> > Ravi Kumar Gadagotti.
> >> >
> >> > On Tue, Oct 2, 2018 at 3:13 PM Surekha Saharan <
> >> surekha.saha...@imply.io>
> >> > wrote:
> >> >
> >> > > Hi Ravi,
> >> > >
> >> > > It's great that you are interested in contributing to Druid!
> >> > >
> >> > > You can do the following to get started:
> >> > > - Checkout the community page here  http://druid.io/community/
> >> > > - Subscribe to the dev list
> >> > > - Checkout the open issues here
> >> > > https://github.com/apache/incubator-druid/issues/ (maybe the ones that are
> >> > > marked easy)
> >> > > - Check this on contributing guidelines:
> >> > > https://github.com/apache/incubator-druid/blob/master/CONTRIBUTING.md
> >> > >
> >> > > Good luck,
> >> > > Surekha
> >> > >
> >> > > On Tue, Oct 2, 2018 at 12:04 PM Ravi Kumar Gadagotti <
> >> > > ravikumargadago...@gmail.com> wrote:
> >> > >
> >> > > > Hi,
> >> > > >
> >> > > > My name is Ravi Kumar Gadagotti, and I want to contribute code to the
> >> > > > druid community and I have no idea where to start and how to start, I am a
> >> > > > java developer with good amount of experience in Java as well as hadoop so
> >> > > > please let me know how can I help or contribute to this project.
> >> > > >
> >> > > > Thanks,
> >> > > > Ravi Kumar Gadagotti.
> >> > > >
> >> > >
> >> >
> >>
> >
>

