Hi, I have to have it like this: ArrowSchema *arrow_schema = new ArrowSchema; ArrowArray *arrow_array = new ArrowArray;
Otherwise there will be a seg fault in c++ code. It’s maybe a requirement for DuckDB. Best regards, Kimmo -- Kimmo Linna Nihtisalontie 3 as 1 02630 ESPOO [email protected] +358 40 590 1074 > On 8. Mar 2023, at 18.10, Weston Pace <[email protected]> wrote: > > I agree with Aldrin. `arrow_schema` has type ArrowSchema*. You are passing > in `&arrow_schema` which has type `ArrowSchema**` but you are casting it to > `duckdb_arrow_schema*` so it at least compiles. You can also simplify things > by getting rid of some of this allocation. > > ``` > ArrowSchema arrow_schema; > duckdb_query_arrow_schema(arrow_result, (duckdb_arrow_schema > *)&arrow_schema); > auto schema = arrow::ImportSchema(&arrow_schema); > auto output_stream = arrow::io::BufferOutputStream::Create(); > auto batch_writer = arrow::ipc::MakeStreamWriter(*output_stream, *schema); > while (true) > { > ArrowArray arrow_array; > duckdb_query_arrow_array(arrow_result, (duckdb_arrow_array > *)&arrow_array); > if (arrow_array.length > 0) > { > (*batch_writer)->WriteRecordBatch((const arrow::RecordBatch > &)**arrow::ImportRecordBatch(&arrow_array, *schema)); > } > else > { > arrow_array.release(&arrow_array); > break; > } > } > auto buffer = (*output_stream)->Finish(); > ``` > > > On Tue, Mar 7, 2023 at 11:26 AM Aldrin <[email protected] > <mailto:[email protected]>> wrote: >> I might be misreading, but do you need to get the address of your object >> pointers, `*arrow_schema` and `*arrow_array`? If they're already pointers, >> I'm not sure you need to get their address and cast that to a pointer. >> otherwise, I don't see anything else that stands out as problematic to me. >> >> You could try to check the Result objects so that you get better error >> information before the segfault occurs; as it stands now you're blindly >> dereferencing. The other thing that might be helpful is that I think you >> don't have to call `new` every time. In all of these instances you are >> allocating memory for the objects, but you could just create them directly >> and then pass references to them. For example: >> >> ArrowSchema arrow_schema; >> duckdb_query_arrow_schema(arrow_result, &arrow_schema); >> ... >> >> Since ArrowSchema seems to be default constructible, I believe the >> declaration alone will do the default construction (`ArrowSchema >> arrow_schema`). But, my cpp knowledge comes and goes, so apologies if those >> suggestions aren't accurate. >> >> Aldrin Montana >> Computer Science PhD Student >> UC Santa Cruz >> >> >> On Sun, Mar 5, 2023 at 8:56 AM Kimmo Linna <[email protected] >> <mailto:[email protected]>> wrote: >>> Hi, >>> >>> I noticed that my previous code worked only for one ArrowArray from DuckDB. >>> I change a code this but this is still working only for one ArrowArray. The >>> second “loop” causes a segmentation fault. Could someone guide me a bit? >>> What should I change to get this work? >>> >>> >>> ArrowSchema *arrow_schema = new ArrowSchema(); >>> duckdb_query_arrow_schema(arrow_result, (duckdb_arrow_schema >>> *)&arrow_schema); >>> auto schema = arrow::ImportSchema(arrow_schema); >>> auto output_stream = arrow::io::BufferOutputStream::Create(); >>> auto batch_writer = arrow::ipc::MakeStreamWriter(*output_stream, *schema); >>> while (true) { >>> ArrowArray *arrow_array = new ArrowArray(); >>> duckdb_query_arrow_array(arrow_result, (duckdb_arrow_array >>> *)&arrow_array); >>> if( arrow_array->length > 0) { >>> (*batch_writer)->WriteRecordBatch((const arrow::RecordBatch&) >>> **arrow::ImportRecordBatch(arrow_array,*schema)); >>> delete arrow_array; >>> }else{ >>> break; >>> } >>> } >>> auto buffer = (*output_stream)->Finish(); >>> Best regards, >>> >>> Kimmo >>> >>> -- >>> Kimmo Linna >>> Nihtisalontie 3 as 1 >>> 02630 ESPOO >>> [email protected] <mailto:[email protected]> >>> +358 40 590 1074 >>> >>> >>>
