On Thu, Jan 30, 2020 at 3:43 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>
>> (FWIW, we developed ArrayDataVisitor primarily for internal library
>> use and not as a public API)
>> I would personally try to first use VisitArrayInline if at all
>> possible since it is simpler
>
>
> Is VisitArrayInline meant to be for public use?  visitor_inline.h still has 
> the disclaimer "Private header, not to be exported".

I think it should be fine for public use -- we should amend the
documentation. The "inline" version is simpler in many ways than
the virtual Visitor when you have a templated Visit function that
matches many type cases.

> Thanks,
> Micah
>
> On Wed, Jan 29, 2020 at 8:57 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>> On Wed, Jan 29, 2020 at 9:55 AM Calder, Matthew <mcal...@xbktrading.com> 
>> wrote:
>> >
>> > I managed to get conversion from CH to arrow using a CHToArrowType<> 
>> > inter-type traits concept. However, I am still trying to crack the use of:
>> >
>> >  arrow::VisitArrayInline
>>
>> Here's a minimal example of VisitArrayInline
>>
>> struct ArrayVisitor {
>>   Status Visit(const Array& arr) {
>>     return Status::OK();
>>   }
>> };
>>
>> Status VisitArrayInlineExample(const Array& arr) {
>>   ArrayVisitor visitor;
>>   return VisitArrayInline(arr, &visitor);
>> }
>>
>> You can add different Visit functions to match different specific
>> Array subclasses or groups of types (e.g. integers, floating point,
>> etc.). std::enable_if is helpful (and the various helper templates in
>> arrow/type_traits.h)
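
An illustrative sketch of the overload-grouping idea, written without Arrow so that it compiles on its own. MiniArray, MiniNumericArray and VisitMiniArrayInline are hypothetical stand-ins for Arrow's array classes and VisitArrayInline, and bool stands in for Status. Only the std::enable_if grouping is the point here; it is not how the library itself is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-ins (NOT Arrow classes): a runtime type id plus
// concrete array types that expose the C type they store.
enum class MiniTypeId { INT32, INT64, DOUBLE };

struct MiniArray {
  explicit MiniArray(MiniTypeId id) : id(id) {}
  virtual ~MiniArray() = default;
  MiniTypeId id;
};

template <typename T, MiniTypeId kId>
struct MiniNumericArray : MiniArray {
  using value_type = T;
  explicit MiniNumericArray(std::vector<T> v)
      : MiniArray(kId), values(std::move(v)) {}
  std::vector<T> values;
};

using MiniInt32Array = MiniNumericArray<int32_t, MiniTypeId::INT32>;
using MiniInt64Array = MiniNumericArray<int64_t, MiniTypeId::INT64>;
using MiniDoubleArray = MiniNumericArray<double, MiniTypeId::DOUBLE>;

// Stand-in for the inline dispatcher: switch on the runtime id once, then
// call visitor->Visit with the concrete array type.
template <typename Visitor>
bool VisitMiniArrayInline(const MiniArray& arr, Visitor* visitor) {
  switch (arr.id) {
    case MiniTypeId::INT32:
      return visitor->Visit(static_cast<const MiniInt32Array&>(arr));
    case MiniTypeId::INT64:
      return visitor->Visit(static_cast<const MiniInt64Array&>(arr));
    case MiniTypeId::DOUBLE:
      return visitor->Visit(static_cast<const MiniDoubleArray&>(arr));
  }
  return false;
}

struct KindVisitor {
  std::string kind;

  // Selected for every array type whose value_type is an integer.
  template <typename ArrayType>
  typename std::enable_if<
      std::is_integral<typename ArrayType::value_type>::value, bool>::type
  Visit(const ArrayType&) {
    kind = "integer";
    return true;
  }

  // Selected for every array type whose value_type is floating point.
  template <typename ArrayType>
  typename std::enable_if<
      std::is_floating_point<typename ArrayType::value_type>::value,
      bool>::type
  Visit(const ArrayType&) {
    kind = "floating";
    return true;
  }
};

std::string DescribeKind(const MiniArray& arr) {
  KindVisitor visitor;
  const bool ok = VisitMiniArrayInline(arr, &visitor);
  assert(ok);
  (void)ok;
  return visitor.kind;
}
```

Two Visit templates cover three array types here. With the real library the same grouping would presumably be expressed through the helper templates in arrow/type_traits.h rather than std::is_integral on a value_type member.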
>>
>> >
>> > and
>> >
>> > arrow::ArrayDataVisitor
>>
>> Here's an example (didn't compile this, but hopefully this gives the idea)
>>
>> struct BooleanValueVisitor {
>>   int64_t num_true = 0;
>>   int64_t num_null = 0;
>>
>>   Status VisitNull() {
>>     ++num_null;
>>     return Status::OK();
>>   }
>>
>>   Status VisitValue(bool value) {
>>     if (value) ++num_true;
>>     return Status::OK();
>>   }
>> };
>>
>>
>> Status VisitBooleanValues(const Array& arr) {
>>   BooleanValueVisitor visitor;
>>   return ArrayDataVisitor<BooleanType>::Visit(*arr.data(), &visitor);
>> }
>>
>> If you have a type-parameterized visitor, then you could have
>>
>> template <typename ArrowType>
>> Status VisitArrayValues(const Array& arr) {
>>   MyValueVisitor<ArrowType> visitor;
>>   return ArrayDataVisitor<ArrowType>::Visit(*arr.data(), &visitor);
>> }
>>
>> (FWIW, we developed ArrayDataVisitor primarily for internal library
>> use and not as a public API)
>>
>> I would personally try to first use VisitArrayInline if at all
>> possible since it is simpler
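
To make the VisitNull/VisitValue contract concrete, here is a self-contained sketch that does not use Arrow at all. MiniBooleanData and VisitMiniBooleanData are hypothetical stand-ins for the array data and for the role ArrayDataVisitor plays, and bool stands in for Status. It only illustrates how each slot is routed to one of the two callbacks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a boolean column: values plus a parallel
// validity mask (false in is_valid means the slot is null).
struct MiniBooleanData {
  std::vector<bool> values;
  std::vector<bool> is_valid;
};

// Hypothetical helper in the role of the data visitor: walk the slots once
// and route each one to VisitNull() or VisitValue(), stopping on failure.
template <typename Visitor>
bool VisitMiniBooleanData(const MiniBooleanData& data, Visitor* visitor) {
  assert(data.values.size() == data.is_valid.size());
  for (std::size_t i = 0; i < data.values.size(); ++i) {
    const bool ok = data.is_valid[i] ? visitor->VisitValue(data.values[i])
                                     : visitor->VisitNull();
    if (!ok) return false;
  }
  return true;
}

// Same shape as the BooleanValueVisitor above, with bool instead of Status.
struct BooleanCounter {
  int64_t num_true = 0;
  int64_t num_null = 0;

  bool VisitNull() {
    ++num_null;
    return true;
  }

  bool VisitValue(bool value) {
    if (value) ++num_true;
    return true;
  }
};

BooleanCounter CountBooleans(const MiniBooleanData& data) {
  BooleanCounter counter;
  VisitMiniBooleanData(data, &counter);
  return counter;
}
```

The visitor never sees the validity mask itself. It only receives "null" or "value" events, which is why the same visitor struct can be reused across element types.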
>>
>> >
>> > I have a struct:
>> >
>> > struct AnArrayUser
>> > {
>> >      template <typename T> arrow::Status Visit(const T &a)
>> >      {
>> >            // How to invoke ArrayDataVisitor?
>> >      }
>> >
>> >      void Use(const arrow::Array &a) {arrow::VisitArrayInline(a, this);}
>> >
>> >
>> >      arrow::Status VisitNull() {return arrow::Status::OK();}
>> >      template <class T> arrow::Status VisitValue(T val) {return arrow::Status::OK();}
>> > };
>> >
>> > Which appears to have its "Use" method called appropriately. But inside 
>> > of the Visit method I have so far been unable to find the incantation to 
>> > make a call through the ArrayDataVisitor. I've tried several variations of:
>> >
>> > arrow::ArrayDataVisitor<typename T::TypeClass>::Visit(*(array.data()), this);
>> >
>> > at the // How to .. line above but can't seem to get it to work. I'm sure 
>> > I just have some fundamental misunderstanding of how this is supposed to 
>> > work. Can someone give me some guidance?
>> >
>> > Matt
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: Wes McKinney <wesmck...@gmail.com>
>> > Sent: Wednesday, January 22, 2020 12:03 PM
>> > To: user@arrow.apache.org
>> > Subject: Re: Converting clickhouse column to arrow array
>> >
>> > If you search for "VisitTypeInline" or "VisitArrayInline" in the C++ 
>> > codebase you can find numerous examples of where this is used
>> >
>> > On Wed, Jan 22, 2020 at 10:58 AM Thomas Buhrmann 
>> > <thomas.buehrm...@gmail.com> wrote:
>> > >
>> > > Hi,
>> > > I was looking for something similar, but didn't find a good example in 
>> > > the docs or the source code showing how to use the visitor pattern. It 
>> > > would be great, e.g., to have an example similar to the "Row to columnar 
>> > > conversion", showing a templated way to read arrow columns into C++ 
>> > > vectors using the visitor pattern, and without implementing a separate 
>> > > reader function for each arrow type. Would that be possible?
>> > >
>> > > Many thanks,
>> > > Thomas
>> > >
>> > > On Wed, 22 Jan 2020 at 17:13, Wes McKinney <wesmck...@gmail.com> wrote:
>> > >>
>> > >> hi Matt,
>> > >>
>> > >> I recommend you use the visitor pattern combined with the
>> > >> arrow::TypeTraits that we provide
>> > >>
>> > >> https://github.com/apache/arrow/blob/master/cpp/src/arrow/type_traits.h
>> > >>
>> > >> You'll need to provide a compile-time mapping from Clickhouse types
>> > >> to Arrow types, but then you can statically access the correct
>> > >> builder type at compile time
>> > >>
>> > >> using ArrowType = typename CHToArrowType<CHType>::ArrowType;
>> > >> using BuilderType = typename TypeTraits<ArrowType>::BuilderType;
>> > >>
>> > >> ...
>> > >>
>> > >> or similar. In cases where the exported Clickhouse data does not have
>> > >> an associated AppendValues method in Arrow you may have to write a
>> > >> special case (please open JIRA issues if you think there should be
>> > >> more AppendValues methods)
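
A rough, self-contained sketch of the two-step compile-time mapping described above. Everything prefixed with Mini, and the CHToMiniType trait, are hypothetical stand-ins invented for illustration. They play the roles of Arrow's type classes, TypeTraits<...>::BuilderType and a user-written CHToArrowType, but none of them are Arrow or Clickhouse APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical type tags standing in for Arrow type classes.
struct MiniInt32Type {};
struct MiniDoubleType {};

// Hypothetical builder standing in for an Arrow builder; Finish() simply
// hands back the accumulated values.
template <typename CType>
struct MiniBuilder {
  void AppendValues(const std::vector<CType>& v) {
    values.insert(values.end(), v.begin(), v.end());
  }
  std::vector<CType> Finish() { return std::move(values); }
  std::vector<CType> values;
};

// Plays the role of TypeTraits<ArrowType>::BuilderType.
template <typename MiniType>
struct MiniTypeTraits;
template <>
struct MiniTypeTraits<MiniInt32Type> {
  using BuilderType = MiniBuilder<int32_t>;
};
template <>
struct MiniTypeTraits<MiniDoubleType> {
  using BuilderType = MiniBuilder<double>;
};

// Plays the role of the user-provided CHToArrowType mapping: source element
// type -> type tag. One small specialization per supported type.
template <typename CHType>
struct CHToMiniType;
template <>
struct CHToMiniType<int32_t> {
  using ArrowType = MiniInt32Type;
};
template <>
struct CHToMiniType<double> {
  using ArrowType = MiniDoubleType;
};

// A single generic function takes the place of one overload per element
// type; the builder is chosen at compile time through the two traits.
template <typename CHType>
std::vector<CHType> ConvertColumn(const std::vector<CHType>& column) {
  using ArrowType = typename CHToMiniType<CHType>::ArrowType;
  using BuilderType = typename MiniTypeTraits<ArrowType>::BuilderType;
  BuilderType builder;
  builder.AppendValues(column);
  return builder.Finish();
}
```

Adding a new element type then means adding trait specializations rather than another conversion function. An unsupported type fails at compile time because the primary templates are left undefined.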
>> > >>
>> > >> Thanks
>> > >>
>> > >> On Wed, Jan 22, 2020 at 7:44 AM Calder, Matthew 
>> > >> <mcal...@xbktrading.com> wrote:
>> > >> >
>> > >> > Hi,
>> > >> >
>> > >> >
>> > >> >
>> > >> > I am interfacing arrow to a Clickhouse database using their c++ 
>> > >> > client. Both arrow and CH have generic array-like classes with the 
>> > >> > element data type internalized. Ideally, I would like to be able to 
>> > >> > write something like:
>> > >> >
>> > >> >
>> > >> >
>> > >> > arrow::Array a = SomeConversionInvocation(clickhouse::Column c);
>> > >> >
>> > >> >
>> > >> >
>> > >> > Where the array and column have the same element type (int, double, 
>> > >> > string, …) but the code is generic to the specific type.
>> > >> >
>> > >> >
>> > >> >
>> > >> > I can do this by explicitly handling specific types through template 
>> > >> > specialization but I thought that since arrow already has pretty 
>> > >> > generic type handling through its templates, and clickhouse also has 
>> > >> > similar capability there ought to be a more seamless way to do the 
>> > >> > conversion. Zero copy would probably be a lot to ask, but something 
>> > >> > short of template specializations for every type is what I am aiming 
>> > >> > for.
>> > >> >
>> > >> >
>> > >> >
>> > >> > I currently do explicit type specialization. For example I have 
>> > >> > functions like:
>> > >> >
>> > >> >
>> > >> >
>> > >> > inline std::shared_ptr<arrow::Array> makeArray(const std::vector<double> &v)
>> > >> > {
>> > >> >     arrow::DoubleBuilder builder;
>> > >> >     builder.AppendValues(v);
>> > >> >     std::shared_ptr<arrow::Array> array;
>> > >> >     builder.Finish(&array);
>> > >> >     return array;
>> > >> > }
>> > >> >
>> > >> > inline std::shared_ptr<arrow::Array> makeArray(const std::vector<int> &v)
>> > >> > {
>> > >> >     arrow::Int32Builder builder;
>> > >> >     builder.AppendValues(v);
>> > >> >     std::shared_ptr<arrow::Array> array;
>> > >> >     builder.Finish(&array);
>> > >> >     return array;
>> > >> > }
>> > >> >
>> > >> >
>> > >> >
>> > >> > Which I suspect is unnecessarily explicit. Is there a more generic 
>> > >> > way of handling the variety of underlying array element data types 
>> > >> > when constructing arrow::Array objects? And can someone point me to 
>> > >> > examples that interface arrow to another similarly generically typed 
>> > >> > library (doesn’t have to be clickhouse). Thanks for any guidance.
>> > >> >
>> > >> >
>> > >> >
>> > >> > Matt
>> > >> >
>> > >> >
>> > >> >
>> > >> >
>> > >> > The information contained in this e-mail may be confidential and is 
>> > >> > intended solely for the use of the named addressee.
>> > >> >
>> > >> > Access, copying or re-use of the e-mail or any information contained 
>> > >> > therein by any other person is not authorized.
>> > >> >
>> > >> > If you are not the intended recipient please notify us immediately by 
>> > >> > returning the e-mail to the originator.
>> > >> >
>> > >> > Disclaimer Version MB.US.1
>> >
