For the record, other people in the Arrow community have discussed
building an adapter for ClickHouse (CH):

https://issues.apache.org/jira/browse/ARROW-3156

It might be advisable to find others in the CH community who are
interested and build a shared solution -- this work would be welcome
inside Apache Arrow IMHO (as would other database interfaces).
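
For illustration, here is a minimal sketch of the compile-time mapping
described in the quoted message below. It is self-contained and does not
include any Arrow or ClickHouse headers: everything in the `sketch`
namespace (DoubleType, Int32Type, Array, NumericBuilder, TypeTraits,
ColumnVector, MakeArray) is a simplified stand-in written for this
example, and CHToArrowType is the hypothetical mapping trait you would
have to supply yourself. It only shows how one function template can
pick the right builder for each column type.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Stand-ins for Arrow type tags such as arrow::DoubleType (assumption:
// simplified, not the real classes).
struct DoubleType {
  using c_type = double;
  static const char* name() { return "double"; }
};
struct Int32Type {
  using c_type = std::int32_t;
  static const char* name() { return "int32"; }
};

// Stand-in for the type-erased arrow::Array base class.
struct Array {
  virtual ~Array() = default;
  virtual std::string type_name() const = 0;
  virtual std::size_t length() const = 0;
};

template <typename ArrowType>
struct NumericArray : Array {
  explicit NumericArray(std::vector<typename ArrowType::c_type> v)
      : values(std::move(v)) {}
  std::string type_name() const override { return ArrowType::name(); }
  std::size_t length() const override { return values.size(); }
  std::vector<typename ArrowType::c_type> values;
};

// Stand-in builder. The real Arrow builders return a status from
// AppendValues and Finish; this one cannot fail, to keep the sketch short.
template <typename ArrowType>
class NumericBuilder {
 public:
  using value_type = typename ArrowType::c_type;
  void AppendValues(const std::vector<value_type>& v) {
    buffer_.insert(buffer_.end(), v.begin(), v.end());
  }
  std::shared_ptr<Array> Finish() {
    return std::make_shared<NumericArray<ArrowType>>(std::move(buffer_));
  }

 private:
  std::vector<value_type> buffer_;
};

// Stand-in for arrow::TypeTraits: maps a type tag to its builder.
template <typename ArrowType>
struct TypeTraits {
  using BuilderType = NumericBuilder<ArrowType>;
};

// Hypothetical ClickHouse-like column holding a typed vector.
template <typename T>
struct ColumnVector {
  std::vector<T> data;
};
using ColumnFloat64 = ColumnVector<double>;
using ColumnInt32 = ColumnVector<std::int32_t>;

// The user-supplied compile-time mapping: one line per column type.
template <typename CHType>
struct CHToArrowType;
template <>
struct CHToArrowType<ColumnFloat64> { using ArrowType = DoubleType; };
template <>
struct CHToArrowType<ColumnInt32> { using ArrowType = Int32Type; };

// One generic conversion instead of one overload per element type.
template <typename CHType>
std::shared_ptr<Array> MakeArray(const CHType& column) {
  using ArrowType = typename CHToArrowType<CHType>::ArrowType;
  using BuilderType = typename TypeTraits<ArrowType>::BuilderType;
  BuilderType builder;
  builder.AppendValues(column.data);
  return builder.Finish();
}

}  // namespace sketch
```

With these stand-ins, `sketch::MakeArray(sketch::ColumnFloat64{{1.5, 2.5}})`
and `sketch::MakeArray(sketch::ColumnInt32{{1, 2}})` both go through the
same function body. In real Arrow code the AppendValues and Finish calls
return an arrow::Status that should be checked rather than ignored, and
column types without a matching AppendValues overload would still need a
special case.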

On Wed, Jan 22, 2020 at 10:12 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> hi Matt,
>
> I recommend you use the visitor pattern combined with the
> arrow::TypeTraits that we provide
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/type_traits.h
>
> You'll need to provide a compile-time mapping from ClickHouse types to
> Arrow types, but then you can select the correct builder type
> statically
>
> using ArrowType = typename CHToArrowType<CHType>::ArrowType;
> using BuilderType = typename TypeTraits<ArrowType>::BuilderType;
>
> ...
>
> or similar. In cases where the exported ClickHouse data does not have
> an associated AppendValues method in Arrow, you may have to write a
> special case (please open JIRA issues if you think there should be
> more AppendValues methods).
>
> Thanks
>
> On Wed, Jan 22, 2020 at 7:44 AM Calder, Matthew <mcal...@xbktrading.com> wrote:
> >
> > Hi,
> >
> >
> >
> > I am interfacing Arrow to a ClickHouse database using their C++ client.
> > Both Arrow and CH have generic array-like classes with the element data
> > type internalized. Ideally, I would like to be able to write something like:
> >
> >
> >
> > arrow::Array a = SomeConversionInvocation(clickhouse::Column c);
> >
> >
> >
> > where the array and column have the same element type (int, double, string,
> > …) but the code is generic over that element type.
> >
> >
> >
> > I can do this by explicitly handling specific types through template
> > specialization, but I thought that since Arrow already has pretty generic
> > type handling through its templates, and ClickHouse has similar
> > capability, there ought to be a more seamless way to do the conversion. Zero
> > copy would probably be a lot to ask, but something short of template
> > specializations for every type is what I am aiming for.
> >
> >
> >
> > I currently do explicit type specialization. For example, I have functions
> > like:
> >
> >
> >
> > inline std::shared_ptr<arrow::Array> makeArray(const std::vector<double> &v)
> > {
> >     arrow::DoubleBuilder builder;
> >     builder.AppendValues(v);
> >     std::shared_ptr<arrow::Array> array;
> >     builder.Finish(&array);
> >     return array;
> > }
> >
> >
> >
> > inline std::shared_ptr<arrow::Array> makeArray(const std::vector<int> &v)
> > {
> >     arrow::Int32Builder builder;
> >     builder.AppendValues(v);
> >     std::shared_ptr<arrow::Array> array;
> >     builder.Finish(&array);
> >     return array;
> > }
> >
> >
> >
> > This, I suspect, is unnecessarily explicit. Is there a more generic way of
> > handling the variety of underlying array element data types when
> > constructing arrow::Array objects? Can someone point me to examples
> > that interface Arrow to another similarly generically typed library
> > (it doesn’t have to be ClickHouse)? Thanks for any guidance.
> >
> >
> >
> > Matt
> >
