Hi,

My organization already uses the official C# Arrow library for a product.
The official library seems to be working fine in the product, so I think it has become more stable than it used to be.
Moreover, I think that if we focus on the official C# implementation, we can test it in more environments, including the product, which makes it easier to improve quality and stability.

Kind regards,
----
Takashi Hashida

On 2020/07/12 22:11:20, anthony.ab...@gmail.com wrote:
> Wes,
>
> >I find that people have many reasons for not contributing to
> >an existing open source project, so I want to make sure I know what
> >yours are, whether one of:
>
> For the record, I have contributed to this project: both PRs and Jiras for
> various 'bugs' as I found them - some (most?) made their way in as a fix so
> I don't agree with this statement or any of your 3 choices.
>
> Regarding our private fork - we did a lot of testing to get something with
> stability (I also reported and identified a 'random' crash bug that should
> be a showstopper for any production application) - we don't want to repeat
> this level of testing until we have a need.
>
> -Anthony
>
>
> On Sun, Jul 12, 2020 at 4:57 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > On Sun, Jul 12, 2020 at 2:44 PM <anthony.ab...@gmail.com> wrote:
> > >
> > > Wes,
> > >
> > > I thought Arrow was (or at least includes) an open standard for
> > > interoperability? There are even specific 'implementation guidelines'
> > > regarding supporting parts or all of the specification.
> >
> > That's true, but at the moment there is not any C# library available
> > that has been demonstrated (by passing the integration tests) to
> > correctly implement the columnar specification. This idea of
> > "instability" or "precariousness" is a non-issue if the C# development
> > community will follow the example of the other reference
> > implementations and implement the integration tests.
> > We discussed this in a JIRA last fall
> >
> > https://issues.apache.org/jira/browse/ARROW-7156
> >
> > To summarize, without integration tests, an Arrow reference
> > implementation can't be considered seaworthy. For example, recently
> > the Rust library found only through integration testing that some
> > parts of the format aren't implemented correctly.
> >
> > In any case, I'd like to do what I can to help the C# ecosystem have a
> > trustworthy reference implementation of Arrow that can be used to
> > build production applications.
> >
> > > It appears that fragmentation is already a problem (ie private forks)
> >
> > Private forks are only a problem if there is a permanent divergence
> > with no intent to upstream patches. With any open source project you
> > see organizations apply patches to upstream for reasons of business
> > expediency and then work to upstream those patches.
> >
> > > Where I work, we don't trust the C# library to do anything other than what
> > > we know works: writing certain large files with only a subset of the
> > > supported column types. We had even considered switching to C++, but I
> > > was able to get something stable. To give you an idea of how precarious it
> > > is though, we can't even read the files we just created (but we know they
> > > work since they open fine in R) We decided having a write only library
> > > was 'good enough' since we don't need to consume the files ourselves.
> > >
> > > I decided sometime ago that to get the features I want / need out of an
> > > Arrow library, it was easier to build an independent implementation
> > > directly from the spec/.fbs, rather than try to apply bandages to what
> > > already existed. Aside from the numerous bugs, the current library is just
> > > not designed for parallelism and speed.
> >
> > I think it's fine to enumerate your criticisms of the current codebase
> > and make constructive recommendations about what you would like to see
> > change.
> > I find that people have many reasons for not contributing to
> > an existing open source project, so I want to make sure I know what
> > yours are, whether one of:
> >
> > * Not wanting to refactor and work within an existing codebase
> > * Belief that there is resistance or difficulty having patches accepted
> > * For reasons of business expediency, not wanting to collaborate (e.g.
> > in code reviews) with developers outside of your organization or
> > participate in a process where one does not have unilateral control
> > over when commits are merged to master
> >
> > I haven't seen any resistance to C# PRs. If anyone ever is concerned
> > about this please raise it on the mailing list.
> >
> > Thanks,
> > Wes
> >
> > >
> > > On Sun, Jul 12, 2020 at 1:36 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > > hi Anthony,
> > > >
> > > > On Sun, Jul 12, 2020 at 12:13 PM <anthony.ab...@gmail.com> wrote:
> > > > >
> > > > > I am in the same position as Adam - We don't use the official apache arrow
> > > > > library any more either and have been using an old fork with our own
> > > > > (probably the same) bug fixes.
> > > > >
> > > > > Personally, I have somewhat given up on the Apache .Net library... I have
> > > > > an alternative C# arrow library that I have written (from the flat buffers
> > > > > spec) that has C# features I need / want... Async/Await - Tasks,
> > > > > IAsyncEnumerable, multi-threading / high performance/ serialization
> > > > > plugins, etc) - I am considering releasing it since I think many others
> > > > > could benefit from it over the current library.
> > > >
> > > > I'm a bit confused by this, it seems like avoiding fragmentation and
> > > > having a canonical library that is well-supported by the community is
> > > > the goal we are all working toward. Why would the current library not
> > > > evolve to have the features you need?
> > > > I don't think there is any
> > > > barrier (aside from having to respond to code review comments) to
> > > > having patches accepted.
> > > >
> > > > From my perspective, we accepted an initial C# code donation from
> > > > Feyen Zylstra but then there were not many further contributions from
> > > > this organization. Eric from Microsoft has done some development work,
> > > > but otherwise it seems like we are still in "community bootstrapping"
> > > > mode. If there are individuals who are invested in having a good
> > > > standard Arrow library for C#, you are as free as any other open
> > > > source contributor to take up a de facto leadership role in the
> > > > project.
> > > >
> > > > To have an Arrow library that can be trusted for mission critical work
> > > > (i.e. that passes the integration test suite, in particular) is a
> > > > significant amount of work, so I'm concerned if the C# community does
> > > > not pool efforts on this that the most likely outcome is that Arrow as
> > > > a technology will simply fail to get traction in the .NET world.
> > > >
> > > > > -Anthony
> > > > >
> > > > >
> > > > > On Fri, Jul 10, 2020 at 2:23 PM Eric Erhardt
> > > > > <eric.erha...@microsoft.com.invalid> wrote:
> > > > >
> > > > > > I agree with Adam, the more usage and feedback we can get the better on
> > > > > > the .NET Library.
> > > > > >
> > > > > > > However there is no library for C# listed anywhere else in the
> > > > > > > documentation.
> > > > > >
> > > > > > We have some XML style doc comments in the code. It would be great if we
> > > > > > could generate a website/markdown from those XML files produced by the
> > > > > > build. And then get it shown under the Documentation tab on
> > > > > > https://arrow.apache.org/. I've opened
> > > > > > https://issues.apache.org/jira/browse/ARROW-9406 for this.
> > > > > >
> > > > > > Eric
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Adam Szmigin <adam.szmi...@xsco.net>
> > > > > > Sent: Friday, July 10, 2020 6:28 AM
> > > > > > To: dev@arrow.apache.org
> > > > > > Subject: [EXTERNAL] Re: .NET support for Arrow
> > > > > >
> > > > > > Hi Yash,
> > > > > >
> > > > > > My organisation is using the C# library for a product we are working on.
> > > > > > However, we are using a fork which includes a number of bug-fixes for
> > > > > > issues that would have otherwise blocked us. I've raised a few PRs to fix
> > > > > > these upstream.
> > > > > >
> > > > > > I think it's fair to say that the C# library is at an early stage of
> > > > > > development at the moment. The more people who are able to test and
> > > > > > contribute back, the better.
> > > > > >
> > > > > > Kind regards,
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Adam Szmigin
> > > > > >
> > > > > > On 10/07/2020 04:05, Yash Ganthe wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > The first paragraph of docs at
> > > > > > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Farrow.apache.org%2F&data=02%7C01%7CEric.Erhardt%40microsoft.com%7C150d7a7f5f1a4274567008d824c46983%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637299773289674614&sdata=IbmMQwZMqlo0Ya7ocgfNrZAsHruErwB%2Bg1DuD7qqzm0%3D&reserved=0
> > > > > > > says it supports C#.
> > > > > > > However there is no library for C# listed anywhere else in the
> > > > > > > documentation. Is .NET supported at all?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Yash