Re: [Java] PR Reviewers
Thanks for the offers. I'll try to do a triage pass in the next few days
and tag some of the people who have volunteered.

Cheers,
Micah

On Mon, Jan 27, 2020 at 10:40 AM Ryan Murray wrote:

> Hey all, I would love to help out. Is there any specific ones that are
> relatively easy for me to get started on?
>
> On Mon, 27 Jan 2020, 18:31 Bryan Cutler, wrote:
>
> > Hi Micah, I don't have a ton of bandwidth at the moment, but I'll try to
> > review some more PRs. Anyone, please feel free to ping me too if you have a
> > stale PR that needs some help getting through. Outreach to other Java
> > communities sounds like a good idea - more Java users would definitely be a
> > good thing!
> >
> > Bryan
> >
> > On Mon, Jan 27, 2020 at 8:12 AM Andy Grove wrote:
> >
> > > I've now started working with the Java implementation of Arrow,
> > > specifically Flight, and would be happy to help although I do have limited
> > > time each week. I can at least review from a Java correctness point of
> > > view.
> > >
> > > Andy.
> > >
> > > On Thu, Jan 23, 2020 at 9:41 PM Micah Kornfield wrote:
> > >
> > > > I mentioned this elsewhere but my intent is to stop doing java reviews for
> > > > the immediate future once I wrap up the few that I have requested change
> > > > on.
> > > >
> > > > I'm happy to try to triage incoming Java PRs, but in order to do this, I
> > > > need to know which committers have some bandwidth to do reviews (some of
> > > > the existing PRs I've tagged people who never responded).
> > > >
> > > > Thanks,
> > > > Micah
Re: [Java] PR Reviewers
> Somewhat related, but are there any thoughts about growing the Java
> developer community generally? Perhaps we could do some outreach to
> other Java-focused Apache communities (Iceberg comes to mind, but
> there may be others)?

I'm all for this. I think one of the things that we are lacking a little
bit on the Java side of things is a clear idea of what we want to build
into Apache Arrow proper. For instance, in the past, I've been -0.5 on
trying to replicate the work that is on-going on the C++ side of things,
but maybe we should reconsider that? Or at least more JNI bindings?
Getting more input on this would be useful especially from those outside
the community. I still think a strong set of adapter libraries,
especially if we can make them "best of class" in performance would be
beneficial for adoption.

> Not directly related, but it would be nice if Java contributors could
> fill the holes in the 0.16.0 release blog post. Currently the Java
> section is empty:
> https://github.com/apache/arrow-site/pull/41

I put a few bullet points in.

On Mon, Jan 27, 2020 at 11:08 AM Antoine Pitrou wrote:

>
> Not directly related, but it would be nice if Java contributors could
> fill the holes in the 0.16.0 release blog post. Currently the Java
> section is empty:
> https://github.com/apache/arrow-site/pull/41
>
> Regards
>
> Antoine.
>
>
> Le 27/01/2020 à 19:40, Ryan Murray a écrit :
> > Hey all, I would love to help out. Is there any specific ones that are
> > relatively easy for me to get started on?
[jira] [Created] (ARROW-7698) [Format][C++] Add tensor and sparse tensor supports in File metadata
Kenta Murata created ARROW-7698:
---

Summary: [Format][C++] Add tensor and sparse tensor supports in File metadata
Key: ARROW-7698
URL: https://issues.apache.org/jira/browse/ARROW-7698
Project: Apache Arrow
Issue Type: New Feature
Components: C++, Format
Reporter: Kenta Murata

--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [Format] Array/RowBatch filters
Thanks for all the input:

> I think having support for this in some way in the IPC
> protocol makes sense (it seems slightly less important for the C API
> but worth thinking about).

The way I read Jacques's e-mail, it seems like the opposite might be
true (at least for Dremio). For IPC I think there is probably a sweet
spot where it doesn't pay to compact the batches, but it would likely
take some tuning.

> The question is how mechanically, would it be some extra buffers at
> the start or end of the record batch body (probably have to be at the
> end of the body for forward compatibility reasons)?

I think for RecordBatch it would be an extra buffer either at the
beginning or the end. It's possible that putting it at the end would
allow better forwards compatibility. I haven't really given much thought
to the design here. My main concern is to define appropriate metadata
before 1.0.0 to maintain forwards compatibility. My thinking is the
metadata would be an enum or null table that indicates "no filters".
Implementations could then determine if they know how to understand the
corresponding buffers correctly based on the metadata.

I can try to put up a straw-man PR for metadata if we think this is
worth pursuing further.

Thanks,
Micah

P.S. This also raises a slightly related concern about letting
applications negotiate "capabilities" at a finer grained level (e.g.
letting the transmitter know that the receiver only supports unfiltered
values).

On Mon, Jan 27, 2020 at 8:34 PM Wes McKinney wrote:

> hi Micah -- I think having support for this in some way in the IPC
> protocol makes sense (it seems slightly less important for the C API
> but worth thinking about). It's helpful to know that Dremio (a big
> Arrow user) already employs various filters / selection vectors.
>
> The question is how mechanically, would it be some extra buffers at
> the start or end of the record batch body (probably have to be at the
> end of the body for forward compatibility reasons)?
>
> On Sun, Jan 26, 2020 at 1:16 PM Jacques Nadeau wrote:
> >
> > At Dremio, we use four main types of selection vector/bitmaps:
> >
> > Dense Format (record valid or not, no ordering)
> > - single bit (bitmap)
> >
> > Sparse formats (identifies valid records as well as their order)
> > - 2 byte (for record batches up to 2^16 records).
> > - 4 byte (for 2^16 batches of 2^16 records);
> > - 6 byte (for 2^32 batches of 2^16 records);
> >
> > We've considered introducing a couple more. I imagine for other use cases,
> > where people use much larger batches of records, different requirements
> > would be necessary. My reason for sharing is it seems like this may be
> > use-case specific. I'd also note that at the IPC level, you'd generally
> > want to contract batches before dropping them on the wire (or at least that
> > is what we typically do).
> >
> > On Fri, Jan 24, 2020 at 11:23 PM Micah Kornfield wrote:
> >
> > > I was thinking selection vector/bitmap (possibly with different encodings),
> > > but really nothing for now. Ordinarily, I'd lean towards YAGNI but there
> > > isn't a good way to add this in easily in a forward compatible way unless
> > > we add a placeholder enum/table for 1.0 (the default option would be no
> > > filter and wouldn't change the packaged data at all).
> > >
> > > On Fri, Jan 24, 2020 at 4:55 AM Francois Saint-Jacques <
> > > fsaintjacq...@gmail.com> wrote:
> > >
> > > > By filter, you mean a filter expression, or a selection vector/bitmap?
> > > >
> > > > On Thu, Jan 23, 2020 at 11:38 PM Micah Kornfield <emkornfi...@gmail.com>
> > > > wrote:
> > > > >
> > > > > One of the things that I think got overlooked in the conversation on having
> > > > > a slice offset in the C API was a suggestion from Jacques of perhaps
> > > > > generalizing the concept to an arbitrary "filter" for arrays/record batches.
> > > > >
> > > > > I believe this point was also discussed in the past as well. I'm not
> > > > > advocating for adding it now but I'm curious if people feel we should add
> > > > > something to Schema.fbs for forward compatibility, in case we wish to
> > > > > support this use-case in the future.
> > > > >
> > > > > Thanks,
> > > > > Micah
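To make the dense and sparse forms mentioned in this thread concrete, here is a minimal Rust sketch. It is illustrative only: the function names (`dense_bitmap`, `sparse_sv2`, `take`) are hypothetical, the least-significant-bit-first bit order is an assumption, and nothing here reflects Dremio's actual layout or any proposed Arrow metadata.

```rust
/// Dense form: one bit per record, set when the record is selected.
/// Bit order (least significant bit first) is an assumption of this sketch.
fn dense_bitmap(selected: &[bool]) -> Vec<u8> {
    let mut bits = vec![0u8; (selected.len() + 7) / 8];
    for (i, &keep) in selected.iter().enumerate() {
        if keep {
            bits[i / 8] |= 1 << (i % 8);
        }
    }
    bits
}

/// Sparse form: 2-byte record indices, in the order the records should be
/// read. Returns None if an index does not fit in 2 bytes.
fn sparse_sv2(order: &[usize]) -> Option<Vec<u16>> {
    order
        .iter()
        .map(|&i| if i <= u16::MAX as usize { Some(i as u16) } else { None })
        .collect()
}

/// Materialize the selected records of one column using a sparse vector.
fn take(values: &[i64], selection: &[u16]) -> Vec<i64> {
    selection.iter().map(|&i| values[i as usize]).collect()
}

fn main() {
    let values = [10i64, 20, 30, 40];

    // Records 0, 2 and 3 are valid; the bitmap carries no ordering.
    let bitmap = dense_bitmap(&[true, false, true, true]);
    assert_eq!(bitmap, vec![0b0000_1101]);

    // The sparse form selects the same records and also reorders them.
    let selection = sparse_sv2(&[3, 0, 2]).expect("indices fit in 2 bytes");
    assert_eq!(take(&values, &selection), vec![40, 10, 30]);
}
```

The dense form costs one bit per record regardless of how many records survive, while the sparse form costs two bytes per surviving record but can express an ordering, which is the trade-off the dense/sparse split in the quoted list points at.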
Re: [Format] Array/RowBatch filters
hi Micah -- I think having support for this in some way in the IPC
protocol makes sense (it seems slightly less important for the C API
but worth thinking about). It's helpful to know that Dremio (a big
Arrow user) already employs various filters / selection vectors.

The question is how mechanically, would it be some extra buffers at
the start or end of the record batch body (probably have to be at the
end of the body for forward compatibility reasons)?

On Sun, Jan 26, 2020 at 1:16 PM Jacques Nadeau wrote:
>
> At Dremio, we use four main types of selection vector/bitmaps:
>
> Dense Format (record valid or not, no ordering)
> - single bit (bitmap)
>
> Sparse formats (identifies valid records as well as their order)
> - 2 byte (for record batches up to 2^16 records).
> - 4 byte (for 2^16 batches of 2^16 records);
> - 6 byte (for 2^32 batches of 2^16 records);
>
> We've considered introducing a couple more. I imagine for other use cases,
> where people use much larger batches of records, different requirements
> would be necessary. My reason for sharing is it seems like this may be
> use-case specific. I'd also note that at the IPC level, you'd generally
> want to contract batches before dropping them on the wire (or at least that
> is what we typically do).
>
> On Fri, Jan 24, 2020 at 11:23 PM Micah Kornfield wrote:
>
> > I was thinking selection vector/bitmap (possibly with different encodings),
> > but really nothing for now. Ordinarily, I'd lean towards YAGNI but there
> > isn't a good way to add this in easily in a forward compatible way unless
> > we add a placeholder enum/table for 1.0 (the default option would be no
> > filter and wouldn't change the packaged data at all).
> >
> > On Fri, Jan 24, 2020 at 4:55 AM Francois Saint-Jacques <
> > fsaintjacq...@gmail.com> wrote:
> >
> > > By filter, you mean a filter expression, or a selection vector/bitmap?
> > >
> > > On Thu, Jan 23, 2020 at 11:38 PM Micah Kornfield wrote:
> > > >
> > > > One of the things that I think got overlooked in the conversation on having
> > > > a slice offset in the C API was a suggestion from Jacques of perhaps
> > > > generalizing the concept to an arbitrary "filter" for arrays/record batches.
> > > >
> > > > I believe this point was also discussed in the past as well. I'm not
> > > > advocating for adding it now but I'm curious if people feel we should add
> > > > something to Schema.fbs for forward compatibility, in case we wish to
> > > > support this use-case in the future.
> > > >
> > > > Thanks,
> > > > Micah
[jira] [Created] (ARROW-7697) [Release] Add a test for updating Linux packages by 00-prepare.sh
Kouhei Sutou created ARROW-7697:
---

Summary: [Release] Add a test for updating Linux packages by 00-prepare.sh
Key: ARROW-7697
URL: https://issues.apache.org/jira/browse/ARROW-7697
Project: Apache Arrow
Issue Type: Improvement
Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
[jira] [Created] (ARROW-7696) [Release] Unit test on release branch is failed
Kouhei Sutou created ARROW-7696:
---

Summary: [Release] Unit test on release branch is failed
Key: ARROW-7696
URL: https://issues.apache.org/jira/browse/ARROW-7696
Project: Apache Arrow
Issue Type: Improvement
Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou

https://github.com/kszucs/arrow/runs/410980755

{noformat}
8 tests, 6 assertions, 1 failures, 2 errors, 0 pendings, 0 omissions, 0 notifications
{noformat}
Re: PR Dashboard for Java?
Bryan -- I just gave you (cutlerb) Confluence edit privileges. These
have to be explicitly managed on a per-user basis to avoid spam
problems.

On Mon, Jan 27, 2020 at 4:12 PM Bryan Cutler wrote:
>
> Thanks Neal, but it doesn't look like I have confluence privileges. That's
> fine though, the github interface is easy enough.
>
> On Mon, Jan 27, 2020 at 11:59 AM Neal Richardson <
> neal.p.richard...@gmail.com> wrote:
>
> > If you have confluence privileges, duplicate a page like
> > https://cwiki.apache.org/confluence/display/ARROW/Ruby+JIRA+Dashboard and
> > then edit the Jira query (something like status in open/in
> > progress/reopened, labels = pull-request-available, component = java,
> > project = ARROW) if you want to make it Java issues that have pull requests
> > open.
> >
> > Or you could bookmark
> >
> > https://github.com/apache/arrow/pulls?utf8=%E2%9C%93=is%3Apr+is%3Aopen+%22%5BJava%5D%22
> > or https://github.com/apache/arrow/labels/lang-java
> >
> > Neal
> >
> > On Mon, Jan 27, 2020 at 11:26 AM Bryan Cutler wrote:
> >
> > > I saw on Confluence that other Arrow components have PR dashboards, but I
> > > don't see one for Java? I think it would be helpful, is it difficult to add
> > > one for Java? I'm happy to do it if someone could point me in the right
> > > direction. Thanks!
> > >
> > > Bryan
[jira] [Created] (ARROW-7695) [Release] Update java versions to 0.16-SNAPSHOT
Krisztian Szucs created ARROW-7695:
--

Summary: [Release] Update java versions to 0.16-SNAPSHOT
Key: ARROW-7695
URL: https://issues.apache.org/jira/browse/ARROW-7695
Project: Apache Arrow
Issue Type: Improvement
Reporter: Krisztian Szucs
Fix For: 0.16.0
Re: PR Dashboard for Java?
Thanks Neal, but it doesn't look like I have confluence privileges.
That's fine though, the github interface is easy enough.

On Mon, Jan 27, 2020 at 11:59 AM Neal Richardson <
neal.p.richard...@gmail.com> wrote:

> If you have confluence privileges, duplicate a page like
> https://cwiki.apache.org/confluence/display/ARROW/Ruby+JIRA+Dashboard and
> then edit the Jira query (something like status in open/in
> progress/reopened, labels = pull-request-available, component = java,
> project = ARROW) if you want to make it Java issues that have pull requests
> open.
>
> Or you could bookmark
>
> https://github.com/apache/arrow/pulls?utf8=%E2%9C%93=is%3Apr+is%3Aopen+%22%5BJava%5D%22
> or https://github.com/apache/arrow/labels/lang-java
>
> Neal
>
> On Mon, Jan 27, 2020 at 11:26 AM Bryan Cutler wrote:
>
> > I saw on Confluence that other Arrow components have PR dashboards, but I
> > don't see one for Java? I think it would be helpful, is it difficult to add
> > one for Java? I'm happy to do it if someone could point me in the right
> > direction. Thanks!
> >
> > Bryan
[jira] [Created] (ARROW-7694) [Packaging][deb][RPM] Can't build repository packages for RC
Kouhei Sutou created ARROW-7694:
---

Summary: [Packaging][deb][RPM] Can't build repository packages for RC
Key: ARROW-7694
URL: https://issues.apache.org/jira/browse/ARROW-7694
Project: Apache Arrow
Issue Type: Improvement
Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou

apache-arrow-archive-keyring failure:

https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=5737=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=5b4cc83a-7bb0-5664-5bb1-588f7e4dc05b=13284

{noformat}
2020-01-27T16:02:31.2221451Z /host/build.sh: 27: cd: can't cd to apache-arrow-archive-keyring-0.16.0/
{noformat}

apache-arrow-release failure:

https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=5774=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=5b4cc83a-7bb0-5664-5bb1-588f7e4dc05b=10330

{noformat}
/var/tmp/rpm-tmp.IfEC8a: line 39: cd: apache-arrow-release-0.16.0: No such file or directory
{noformat}
Re: PR Dashboard for Java?
If you have confluence privileges, duplicate a page like
https://cwiki.apache.org/confluence/display/ARROW/Ruby+JIRA+Dashboard and
then edit the Jira query (something like status in open/in
progress/reopened, labels = pull-request-available, component = java,
project = ARROW) if you want to make it Java issues that have pull
requests open.

Or you could bookmark

https://github.com/apache/arrow/pulls?utf8=%E2%9C%93=is%3Apr+is%3Aopen+%22%5BJava%5D%22
or https://github.com/apache/arrow/labels/lang-java

Neal

On Mon, Jan 27, 2020 at 11:26 AM Bryan Cutler wrote:

> I saw on Confluence that other Arrow components have PR dashboards, but I
> don't see one for Java? I think it would be helpful, is it difficult to add
> one for Java? I'm happy to do it if someone could point me in the right
> direction. Thanks!
>
> Bryan
Re: [DISCUSS][JAVA] Correct the behavior of ListVector isEmpty
Returning null might be more correct, since `getObject(int index)` also
returns a null value if not set, but I don't think it's worth making the
API more complicated for this. It should be fine to return `false` for a
null value. +1 for treating nulls as empty.

On Fri, Jan 24, 2020 at 9:12 AM Brian Hulette wrote:

> What about returning null for a null list? It looks like now the function
> returns a primitive boolean, so I guess that would be a substantial change,
> but null seems more correct to me.
>
> On Thu, Jan 23, 2020, 21:38 Micah Kornfield wrote:
>
> > I would vote for treating nulls as empty.
> >
> > On Fri, Jan 10, 2020 at 12:36 AM Ji Liu wrote:
> >
> > > Hi all,
> > >
> > > Currently isEmpty API is always return false in BaseRepeatedValueVector,
> > > and its subclass ListVector did not overwrite this method.
> > > This will lead to incorrect result, for example, a ListVector with data
> > > [1,2], null, [], [5,6] would get [false, false, false, false] which is not
> > > right.
> > > I opened a PR to fix this[1] and not sure what’s the right behavior for
> > > null value, should it return [false, false, true, false] or [false, true,
> > > true, false] ?
> > >
> > >
> > > Thanks,
> > > Ji Liu
> > >
> > >
> > > [1] https://github.com/apache/arrow/pull/6044
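The two candidate results in Ji Liu's message can be reproduced with a small model. This is a sketch under stated assumptions, not the Java `ListVector` code: a list slot is modeled as a Rust `Option<Vec<i32>>` (with `None` standing in for a null slot), and both function names are hypothetical.

```rust
// Hypothetical model of one list slot: None is a null slot, Some holds the
// list's values. This is not the Arrow Java ListVector implementation.
type ListSlot = Option<Vec<i32>>;

/// Candidate 1: a null slot is not reported as empty.
fn is_empty_null_not_empty(slot: &ListSlot) -> bool {
    match slot {
        Some(values) => values.is_empty(),
        None => false,
    }
}

/// Candidate 2: a null slot is reported as empty.
fn is_empty_null_as_empty(slot: &ListSlot) -> bool {
    slot.as_ref().map_or(true, |values| values.is_empty())
}

fn main() {
    // The example data from the thread: [1,2], null, [], [5,6].
    let data: Vec<ListSlot> = vec![Some(vec![1, 2]), None, Some(vec![]), Some(vec![5, 6])];

    let first: Vec<bool> = data.iter().map(is_empty_null_not_empty).collect();
    let second: Vec<bool> = data.iter().map(is_empty_null_as_empty).collect();

    assert_eq!(first, vec![false, false, true, false]);
    assert_eq!(second, vec![false, true, true, false]);
}
```

Both candidates keep a primitive boolean return type; the third option raised in the thread, returning null for a null list, would instead need a nullable return type, which is the API change Brian describes as substantial.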
PR Dashboard for Java?
I saw on Confluence that other Arrow components have PR dashboards, but I don't see one for Java? I think it would be helpful, is it difficult to add one for Java? I'm happy to do it if someone could point me in the right direction. Thanks! Bryan
Re: [Java] PR Reviewers
Not directly related, but it would be nice if Java contributors could
fill the holes in the 0.16.0 release blog post. Currently the Java
section is empty:
https://github.com/apache/arrow-site/pull/41

Regards

Antoine.


Le 27/01/2020 à 19:40, Ryan Murray a écrit :
> Hey all, I would love to help out. Is there any specific ones that are
> relatively easy for me to get started on?
Re: [Java] PR Reviewers
Hey all, I would love to help out. Is there any specific ones that are
relatively easy for me to get started on?

On Mon, 27 Jan 2020, 18:31 Bryan Cutler, wrote:

> Hi Micah, I don't have a ton of bandwidth at the moment, but I'll try to
> review some more PRs. Anyone, please feel free to ping me too if you have a
> stale PR that needs some help getting through. Outreach to other Java
> communities sounds like a good idea - more Java users would definitely be a
> good thing!
>
> Bryan
>
> On Mon, Jan 27, 2020 at 8:12 AM Andy Grove wrote:
>
> > I've now started working with the Java implementation of Arrow,
> > specifically Flight, and would be happy to help although I do have limited
> > time each week. I can at least review from a Java correctness point of
> > view.
> >
> > Andy.
> >
> > On Thu, Jan 23, 2020 at 9:41 PM Micah Kornfield wrote:
> >
> > > I mentioned this elsewhere but my intent is to stop doing java reviews for
> > > the immediate future once I wrap up the few that I have requested change
> > > on.
> > >
> > > I'm happy to try to triage incoming Java PRs, but in order to do this, I
> > > need to know which committers have some bandwidth to do reviews (some of
> > > the existing PRs I've tagged people who never responded).
> > >
> > > Thanks,
> > > Micah
[jira] [Created] (ARROW-7693) [CI] Fix test-conda-python-3.7-spark-master nightly errors
Bryan Cutler created ARROW-7693:
---

Summary: [CI] Fix test-conda-python-3.7-spark-master nightly errors
Key: ARROW-7693
URL: https://issues.apache.org/jira/browse/ARROW-7693
Project: Apache Arrow
Issue Type: Bug
Components: Continuous Integration
Reporter: Bryan Cutler
Assignee: Bryan Cutler

Spark master renamed some tests, need to update
Re: [Java] PR Reviewers
Hi Micah, I don't have a ton of bandwidth at the moment, but I'll try to
review some more PRs. Anyone, please feel free to ping me too if you
have a stale PR that needs some help getting through. Outreach to other
Java communities sounds like a good idea - more Java users would
definitely be a good thing!

Bryan

On Mon, Jan 27, 2020 at 8:12 AM Andy Grove wrote:

> I've now started working with the Java implementation of Arrow,
> specifically Flight, and would be happy to help although I do have limited
> time each week. I can at least review from a Java correctness point of
> view.
>
> Andy.
>
> On Thu, Jan 23, 2020 at 9:41 PM Micah Kornfield wrote:
>
> > I mentioned this elsewhere but my intent is to stop doing java reviews for
> > the immediate future once I wrap up the few that I have requested change
> > on.
> >
> > I'm happy to try to triage incoming Java PRs, but in order to do this, I
> > need to know which committers have some bandwidth to do reviews (some of
> > the existing PRs I've tagged people who never responded).
> >
> > Thanks,
> > Micah
[jira] [Created] (ARROW-7692) [Rust] Several pattern matches are hard to read
François Garillot created ARROW-7692:
---

Summary: [Rust] Several pattern matches are hard to read
Key: ARROW-7692
URL: https://issues.apache.org/jira/browse/ARROW-7692
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: François Garillot

Several pattern matches can be rewritten directly using a combinator, e.g. array's `value_as_date`, more succinctly expressed as a `map`:

{{match self.value_as_datetime(i) {}}
{{    Some(datetime) => Some(datetime.date()),}}
{{    None => None,}}
{{}}}

More importantly, some of these matches obscure what the code is doing, e.g. parquet column writer `read_fully`'s extraction of a mutable slice:

{{let actual_def_levels = match def_levels {}}
{{    Some(ref mut vec) => Some(&mut vec[..]),}}
{{    None => None,}}
{{};}}

which can be written, using `as_mut` and `map`, as:

{{let actual_def_levels = def_levels.as_mut().map(|vec| &mut vec[..]);}}

A large number of these are meant to be addressed in [https://github.com/apache/arrow/pull/6292/files]
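As a self-contained illustration of the two rewrites described in this issue, here is a minimal sketch. The `DateTime` struct, its `date` method, and the three function names are hypothetical stand-ins so the example compiles on its own; they are not the Arrow or Parquet crate APIs.

```rust
// Hypothetical stand-in for a datetime value; not the Arrow or chrono API.
#[derive(Clone, Copy, Debug, PartialEq)]
struct DateTime {
    days_since_epoch: i32,
}

impl DateTime {
    fn date(&self) -> i32 {
        self.days_since_epoch
    }
}

// The explicit match: both arms only re-wrap the Option.
fn date_with_match(value: Option<DateTime>) -> Option<i32> {
    match value {
        Some(datetime) => Some(datetime.date()),
        None => None,
    }
}

// The same behavior expressed with `Option::map`.
fn date_with_map(value: Option<DateTime>) -> Option<i32> {
    value.map(|datetime| datetime.date())
}

// `as_mut` turns `&mut Option<Vec<T>>` into `Option<&mut Vec<T>>`;
// `map` then borrows the whole vector as a mutable slice.
fn levels_slice(levels: &mut Option<Vec<i16>>) -> Option<&mut [i16]> {
    levels.as_mut().map(|vec| &mut vec[..])
}

fn main() {
    let value = DateTime { days_since_epoch: 18288 };
    assert_eq!(date_with_match(Some(value)), date_with_map(Some(value)));
    assert_eq!(date_with_match(None), date_with_map(None));

    // Writing through the slice updates the vector inside the Option.
    let mut def_levels = Some(vec![0i16, 1, 1]);
    if let Some(slice) = levels_slice(&mut def_levels) {
        slice[0] = 1;
    }
    assert_eq!(def_levels, Some(vec![1, 1, 1]));
}
```

The combinator forms keep the `None` case implicit, so the remaining code states only the transformation applied to the present value.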
[jira] [Created] (ARROW-7691) [C++] Verify missing fields when walking Flatbuffers data
Antoine Pitrou created ARROW-7691:
-

Summary: [C++] Verify missing fields when walking Flatbuffers data
Key: ARROW-7691
URL: https://issues.apache.org/jira/browse/ARROW-7691
Project: Apache Arrow
Issue Type: Task
Components: C++
Affects Versions: 0.15.1
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou

This will fix some of the issues detected by OSS-Fuzz.
[jira] [Created] (ARROW-7690) Cannot write parquet to OutputStream
Bob created ARROW-7690:
--

Summary: Cannot write parquet to OutputStream
Key: ARROW-7690
URL: https://issues.apache.org/jira/browse/ARROW-7690
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 0.15.1
Reporter: Bob

The R package does not allow writing to a FileOutputStream.

Minimal testing code:

    library(arrow)
    tf1 <- arrow::FileOutputStream$create(path = "output.parquet")
    arrow::write_parquet(data.frame(x = 1:5), tf1)

Throws error:

    Error in inherits(sink, OutputStream) : 'what' must be a character vector

The issue appears to be in line 153 of parquet.R:

    if (is.character(sink)) {
      sink <- FileOutputStream$create(sink)
      on.exit(sink$close())
    } *else if (!inherits(sink, OutputStream))* {
      abort("sink must be a file path or an OutputStream")
    }

The class name should be passed as a string: !inherits(sink,'OutputStream')
Re: Improve the ergonomics of new PyArrow FileSystem API in Python ARROW-7584
hi Fabian,

I responded on the JIRA. I'm generally supportive of ergonomic
improvements to the FS API in Python. It might make sense to break the
work into multiple patches to ease review burden.

Thanks for offering to work on this.

- Wes

On Fri, Jan 24, 2020 at 4:46 AM Fabian Höring wrote:
>
> Hello,
>
> I created this ticket to discuss possible improvements of the new PyArrow
> FileSystem API
> https://issues.apache.org/jira/browse/ARROW-7584
>
> As of today there seem to be only two popular projects to have an agnostic
> FileSystem API that can handle S3 & HDFS from Python:
> - PyArrow via https://arrow.apache.org/docs/python/filesystems.html
> - TensorFlow via https://www.tensorflow.org/api_docs/python/tf/io/gfile/GFile
>
> On my side I would like to reuse a clean FileSystem API in my project and
> turned to the arrow for this purpose (I think TensorFlow already handles too
> many use cases should not provide yet another feature).
>
> "Clean FileSystem API" for me also means to cover the interactive use case
> where one uses that API like the file system shell commands. We actually used
> https://github.com/dask/hdfs3 before and it worked really.
>
> Currently there is the FileSystem API work in progress (see
> https://github.com/apache/arrow/blob/master/python/pyarrow/_fs.pyx#L185) and
> I would take the occasion to improve it and fix some issues with the existing
> API.
>
> Can you have a look at the comments on
> https://issues.apache.org/jira/browse/ARROW-7584 and give feedback ?
>
> I can do the implementations I suggest on my side but would like to make sure
> they will be accepted.
>
> Best regards,
> Fabian Höring
Re: [Java] PR Reviewers
I've now started working with the Java implementation of Arrow,
specifically Flight, and would be happy to help although I do have
limited time each week. I can at least review from a Java correctness
point of view.

Andy.

On Thu, Jan 23, 2020 at 9:41 PM Micah Kornfield wrote:

> I mentioned this elsewhere but my intent is to stop doing java reviews for
> the immediate future once I wrap up the few that I have requested change
> on.
>
> I'm happy to try to triage incoming Java PRs, but in order to do this, I
> need to know which committers have some bandwidth to do reviews (some of
> the existing PRs I've tagged people who never responded).
>
> Thanks,
> Micah
Re: [Java] PR Reviewers
Somewhat related, but are there any thoughts about growing the Java
developer community generally? Perhaps we could do some outreach to
other Java-focused Apache communities (Iceberg comes to mind, but
there may be others)?

On Sat, Jan 25, 2020 at 10:14 PM Brian Hulette wrote:
>
> I'm still pretty new to the Java implementation, but I can probably help
> out with some reviews.
>
> On Thu, Jan 23, 2020 at 8:41 PM Micah Kornfield wrote:
>
> > I mentioned this elsewhere but my intent is to stop doing java reviews for
> > the immediate future once I wrap up the few that I have requested change
> > on.
> >
> > I'm happy to try to triage incoming Java PRs, but in order to do this, I
> > need to know which committers have some bandwidth to do reviews (some of
> > the existing PRs I've tagged people who never responded).
> >
> > Thanks,
> > Micah
[NIGHTLY] Arrow Build Report for Job nightly-2020-01-27-0
Arrow Build Report for Job nightly-2020-01-27-0

All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0

Failed Tasks:
- gandiva-jar-osx:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-travis-gandiva-jar-osx
- test-conda-python-3.7-spark-master:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-3.7-spark-master
- test-ubuntu-fuzzit-fuzzing:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-ubuntu-fuzzit-fuzzing
- test-ubuntu-fuzzit-regression:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-ubuntu-fuzzit-regression
- wheel-manylinux2014-cp37m:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-wheel-manylinux2014-cp37m

Succeeded Tasks:
- centos-6:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-centos-6
- centos-7:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-centos-7
- centos-8:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-centos-8
- conda-linux-gcc-py27:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-linux-gcc-py27
- conda-linux-gcc-py36:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-linux-gcc-py36
- conda-linux-gcc-py37:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-linux-gcc-py37
- conda-linux-gcc-py38:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-linux-gcc-py38
- conda-osx-clang-py27:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-osx-clang-py27
- conda-osx-clang-py36:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-osx-clang-py36
- conda-osx-clang-py37:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-osx-clang-py37
- conda-osx-clang-py38:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-osx-clang-py38
- conda-win-vs2015-py36:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-win-vs2015-py36
- conda-win-vs2015-py37:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-win-vs2015-py37
- conda-win-vs2015-py38:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-conda-win-vs2015-py38
- debian-buster:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-debian-buster
- debian-stretch:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-azure-debian-stretch
- gandiva-jar-trusty:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-travis-gandiva-jar-trusty
- homebrew-cpp:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-travis-homebrew-cpp
- macos-r-autobrew:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-travis-macos-r-autobrew
- test-conda-cpp:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-cpp
- test-conda-python-2.7-pandas-latest:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-2.7-pandas-latest
- test-conda-python-2.7:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-2.7
- test-conda-python-3.6:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-3.6
- test-conda-python-3.7-dask-latest:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-3.7-dask-latest
- test-conda-python-3.7-hdfs-2.9.2:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-3.7-hdfs-2.9.2
- test-conda-python-3.7-pandas-latest:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-3.7-pandas-latest
- test-conda-python-3.7-pandas-master:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-3.7-pandas-master
- test-conda-python-3.7-turbodbc-latest:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-3.7-turbodbc-latest
- test-conda-python-3.7-turbodbc-master:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-27-0-circle-test-conda-python-3.7-turbodbc-master
-
[jira] [Created] (ARROW-7689) [C++] Sporadic Flight test crash on macOS
Antoine Pitrou created ARROW-7689:
-

Summary: [C++] Sporadic Flight test crash on macOS
Key: ARROW-7689
URL: https://issues.apache.org/jira/browse/ARROW-7689
Project: Apache Arrow
Issue Type: Bug
Components: C++, FlightRPC
Reporter: Antoine Pitrou

See this build:
https://github.com/apache/arrow/pull/6288/checks?check_run_id=409993893

{code}
[----------] 2 tests from TestTls
[ RUN      ] TestTls.DoAction
E0127 01:40:23.87112 123145508859904 tls_pthread.cc:26] assertion failed: 0 == pthread_setspecific(tls->key, (void*)value)
/Users/runner/runners/2.164.0/work/arrow/arrow/cpp/build-support/run-test.sh: line 97: 32496 Abort trap: 6 $TEST_EXECUTABLE "$@" 2>&1
32497 Done | $ROOT/build-support/asan_symbolize.py
32498 Done | ${CXXFILT:-c++filt}
32499 Done | $ROOT/build-support/stacktrace_addr2line.pl $TEST_EXECUTABLE
32500 Done | $pipe_cmd 2>&1
32501 Done | tee $LOGFILE
~/runners/2.164.0/work/arrow/arrow/build/cpp/src/arrow/flight
{code}

This is a gRPC issue, reported here:
https://github.com/grpc/grpc/issues/20311

We should try to bump bundled gRPC version to see if that fixes the issue.

Side note: why aren't we using the homebrew-provided gRPC?
[jira] [Created] (ARROW-7688) Bump checkstyle from 8.18 to 8.29
Fokko Driesprong created ARROW-7688:
---

Summary: Bump checkstyle from 8.18 to 8.29
Key: ARROW-7688
URL: https://issues.apache.org/jira/browse/ARROW-7688
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 0.15.1
Reporter: Fokko Driesprong
Assignee: Fokko Driesprong
Fix For: 0.16.0