[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #12: Add metadata builder functions

2022-08-05 Thread GitBox
lidavidm commented on code in PR #12: URL: https://github.com/apache/arrow-nanoarrow/pull/12#discussion_r939160612 ## src/nanoarrow/nanoarrow.h: ## @@ -261,6 +261,24 @@ ArrowErrorCode ArrowMetadataGetValue(const char* metadata, const char* key,

[GitHub] [arrow-nanoarrow] paleolimbot merged pull request #10: Implement bitmap setters, getters, and element-wise builder

2022-08-05 Thread GitBox
paleolimbot merged PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [arrow-nanoarrow] paleolimbot closed issue #4: Implement bitmap helpers

2022-08-05 Thread GitBox
paleolimbot closed issue #4: Implement bitmap helpers URL: https://github.com/apache/arrow-nanoarrow/issues/4

[GitHub] [arrow-nanoarrow] paleolimbot opened a new pull request, #14: Owning/mutable `struct ArrowArray`

2022-08-05 Thread GitBox
paleolimbot opened a new pull request, #14: URL: https://github.com/apache/arrow-nanoarrow/pull/14 Fixes #5 by implementing an Array whose buffer lifecycle is handled by `struct ArrowBuffer`.

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #14: Owning/mutable `struct ArrowArray`

2022-08-05 Thread GitBox
lidavidm commented on code in PR #14: URL: https://github.com/apache/arrow-nanoarrow/pull/14#discussion_r939166568 ## src/nanoarrow/typedefs_inline.h: ## @@ -165,6 +212,20 @@ struct ArrowBitmap { int64_t size_bits; }; +/// \brief A structure used as the private data

[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #10: Implement bitmap setters, getters, and element-wise builder

2022-08-05 Thread GitBox
paleolimbot commented on PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#issuecomment-1206514130 I see...I'd been using it to simplify the append process, but the right thing to do is to properly bitpack-as-you-append (which is now implemented) so that the `ArrowBufferXXX()`
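For readers skimming the archive, the "bitpack-as-you-append" idea mentioned in this comment can be sketched roughly as follows. This is an illustrative sketch only, not the code from PR #10: `SketchBitmap` and the `SketchBitmap*()` functions are hypothetical names, and the growth policy (doubling from 8 bytes) is an assumption. The only detail taken from the Arrow columnar format is that validity bits are packed least-significant bit first.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable bitmap; NOT nanoarrow's struct ArrowBitmap. */
struct SketchBitmap {
  uint8_t* data;
  int64_t size_bits;
  int64_t capacity_bytes;
};

static void SketchBitmapInit(struct SketchBitmap* bitmap) {
  bitmap->data = NULL;
  bitmap->size_bits = 0;
  bitmap->capacity_bytes = 0;
}

/* Append one bit, packing it into the current byte as it arrives
 * (least-significant bit first). Returns 0 on success, -1 if the
 * allocation fails. */
static int SketchBitmapAppend(struct SketchBitmap* bitmap, int is_set) {
  int64_t byte_i = bitmap->size_bits / 8;
  if (byte_i >= bitmap->capacity_bytes) {
    /* Assumed growth policy: start at 8 bytes, then double. */
    int64_t new_capacity =
        bitmap->capacity_bytes == 0 ? 8 : bitmap->capacity_bytes * 2;
    uint8_t* new_data = (uint8_t*)realloc(bitmap->data, (size_t)new_capacity);
    if (new_data == NULL) return -1;
    /* Zero the new bytes so that unset bits never need to be cleared. */
    memset(new_data + bitmap->capacity_bytes, 0,
           (size_t)(new_capacity - bitmap->capacity_bytes));
    bitmap->data = new_data;
    bitmap->capacity_bytes = new_capacity;
  }
  if (is_set) {
    bitmap->data[byte_i] |= (uint8_t)(1u << (bitmap->size_bits % 8));
  }
  bitmap->size_bits++;
  return 0;
}

/* Read bit i; the caller must ensure 0 <= i < size_bits. */
static int SketchBitmapGet(const struct SketchBitmap* bitmap, int64_t i) {
  return (bitmap->data[i / 8] >> (i % 8)) & 1;
}

/* Release the buffer and return to the empty state. */
static void SketchBitmapReset(struct SketchBitmap* bitmap) {
  free(bitmap->data);
  SketchBitmapInit(bitmap);
}
```

Packing on append means the buffer is always in its final layout, so no separate "pack" pass over one-byte-per-value booleans is needed before handing the bitmap to a consumer.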

[GitHub] [arrow-adbc] dependabot[bot] opened a new pull request, #54: Bump postgresql from 42.4.0 to 42.4.1 in /java/driver/jdbc-validation-postgresql

2022-08-06 Thread GitBox
dependabot[bot] opened a new pull request, #54: URL: https://github.com/apache/arrow-adbc/pull/54 Bumps [postgresql](https://github.com/pgjdbc/pgjdbc) from 42.4.0 to 42.4.1. Changelog: sourced from [postgresql's changelog](https://github.com/pgjdbc/pgjdbc/blob/master/CHANGELOG.md).

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942434862 ## src/nanoarrow/utils_inline.h: ## @@ -26,6 +26,114 @@ extern "C" { #endif +static inline void ArrowLayoutInit(struct ArrowLayout* layout, +

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942417008 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942442675 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942439274 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942446454 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942472594 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942437669 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942470780 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942446356 ## src/nanoarrow/array_inline.h: ## @@ -0,0 +1,146 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942431188 ## src/nanoarrow/typedefs_inline.h: ## @@ -166,6 +166,24 @@ enum ArrowType { NANOARROW_TYPE_INTERVAL_MONTH_DAY_NANO }; +/// \brief Functional types of

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942447804 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#issuecomment-1210685336 Just a note that I'm reworking this interface based on some thoughts after working with this for a day or so: - Instead of copying all the buffer/bitmap methods for the

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942441450 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-adbc] zeroshade commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
zeroshade commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1211012442 at a minimum it would be good to be able to at least know the *number* of expected inputs even if the schema isn't knowable. Maybe having two values? an integer indicating

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942753583 ## src/nanoarrow/array_inline.h: ## @@ -0,0 +1,246 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[GitHub] [arrow-adbc] lidavidm commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
lidavidm commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1211134073 How about this? The parameters are always encoded as a schema, but unknown types are represented as just NullType. Avoids having lots of optional things/multiple calls.
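A rough sketch of how a caller might read such a parameter schema, assuming each parameter is described by an Arrow C data interface format string, where `"n"` is the null type. The helper name `SketchCountUnknownParams` is hypothetical and is not part of ADBC; a real implementation would walk the children of a `struct ArrowSchema` rather than a plain array of strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of the ADBC API.
 * `formats` holds one Arrow C data interface format string per expected
 * parameter. Under the proposal in this comment, a parameter whose type the
 * driver cannot determine is reported as the null type, whose format string
 * is "n". Returns how many parameters have an unknown type. */
static size_t SketchCountUnknownParams(const char* const* formats,
                                       size_t n_params) {
  size_t unknown = 0;
  for (size_t i = 0; i < n_params; i++) {
    if (strcmp(formats[i], "n") == 0) unknown++;
  }
  return unknown;
}
```

With this encoding, a statement like `SELECT ?, ?` could be reported as two null-typed fields: the caller still learns the number of expected parameters from the field count, even though neither type is known.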

[GitHub] [arrow-nanoarrow] codecov-commenter commented on pull request #17: Buffer element appenders

2022-08-10 Thread GitBox
codecov-commenter commented on PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17#issuecomment-1211146044

[GitHub] [arrow-adbc] zeroshade opened a new issue, #60: Retrieve expected param binding information

2022-08-10 Thread GitBox
zeroshade opened a new issue, #60: URL: https://github.com/apache/arrow-adbc/issues/60 If available, it would be great to be able to retrieve any information about parameter binding that is available. Some potential information that *might* be available: * Number of expected

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-10 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1210989539 Also, possibly the driver manager could define execute-with-result-set and execute-with-rows-affected in terms of the generic execute + generic getters to retrieve the affected

[GitHub] [arrow-adbc] zeroshade opened a new issue, #59: Provide a "just query" method

2022-08-10 Thread GitBox
zeroshade opened a new issue, #59: URL: https://github.com/apache/arrow-adbc/issues/59 For the common case of executing a single SQL string, let's have a method on the connection object for executing the query directly without the need for an intermediate Statement object

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-10 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1210994922 CC @pitrou, @hannes, @krlmlr if you have opinions here? @lwhite1 had the same feedback about executeQuery/execute in Java last month. So for consistency a query method

[GitHub] [arrow-adbc] zeroshade commented on issue #59: Provide a "just query" method

2022-08-10 Thread GitBox
zeroshade commented on issue #59: URL: https://github.com/apache/arrow-adbc/issues/59#issuecomment-1210917005 With this, it might make sense for the `AdbcStatement` object to *only* represent a prepared statement and place the `Prepare` method on the Connection rather than on the

[GitHub] [arrow-adbc] lidavidm commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
lidavidm commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1211081338 That makes sense.

[GitHub] [arrow-adbc] zeroshade opened a new issue, #61: Simplify Execute and Query interface

2022-08-10 Thread GitBox
zeroshade opened a new issue, #61: URL: https://github.com/apache/arrow-adbc/issues/61 Rather than the separate `Execute` / `GetStream` functions, it might be better to follow something similar to FlightSQL's interface or Go's `database/sql` API. Have two functions: * Execute

[GitHub] [arrow-adbc] lidavidm commented on issue #59: Provide a "just query" method

2022-08-10 Thread GitBox
lidavidm commented on issue #59: URL: https://github.com/apache/arrow-adbc/issues/59#issuecomment-1210982209 I think this makes sense to provide as a convenience, but maybe not as the only method. The separate Statement object still lets us configure any options in an ABI-compatible way

[GitHub] [arrow-adbc] lidavidm commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
lidavidm commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1210983818 Flight SQL provides this. I think this makes sense, but yeah, something like `SELECT ?, ?` is going to be dubious. I don't know if there's a great way of indicating that, though.

[GitHub] [arrow-nanoarrow] paleolimbot opened a new pull request, #17: Buffer element appenders

2022-08-10 Thread GitBox
paleolimbot opened a new pull request, #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17 It turns out this is really annoying to do otherwise! Declaring a variable of an appropriate type gets verbose when switching on type, and it sounds like these functions might be useful for

[GitHub] [arrow-adbc] zeroshade commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-10 Thread GitBox
zeroshade commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1210860838 Two more gaps to add: * Retrieve the last inserted id for inserts into an auto-increment table * Retrieve the number of rows affected by the last query (number inserted /

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-10 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1210987487 I think it may still make sense to have a generic Execute to ease compatibility with APIs that do not differentiate between the types of queries (and note JDBC has all three!), but

[GitHub] [arrow-adbc] lidavidm merged pull request #58: [C][Python] Add options to control append vs create for bulk ingest

2022-08-10 Thread GitBox
lidavidm merged PR #58: URL: https://github.com/apache/arrow-adbc/pull/58

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #17: Buffer element appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17#discussion_r942812080 ## src/nanoarrow/buffer_test.cc: ## @@ -160,3 +160,31 @@ TEST(BufferTest, BufferTestError) { ArrowBufferReset(); } + +TEST(BufferTest,

[GitHub] [arrow-adbc] zeroshade commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
zeroshade commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1211193521 Seems good to me! :)

[GitHub] [arrow-adbc] lidavidm commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-10 Thread GitBox
lidavidm commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1211235028 Punting on paramstyle and last inserted ID, but adding row count and current catalog: ```diff commit 50b2e40d727c0a51029d7f5506c0696b3a19a3b9 Author: David Li Date:

[GitHub] [arrow-adbc] lidavidm commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-10 Thread GitBox
lidavidm commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1211236715 Returning strings from a C API is a bit annoying and I'm not sure whether this is preferable, or if we want to go with an ODBC-style API (pass a caller-allocated buffer and length
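The ODBC-style alternative mentioned here (the caller allocates the buffer and passes its length) might look roughly like the sketch below. `SketchGetString` is a hypothetical function, not an ADBC or ODBC call, and the return-code convention (0 for success, 1 for "buffer too small") is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical getter, NOT part of the ADBC API.
 * On entry, *length is the capacity of `buffer` in bytes.
 * On exit, *length is the number of bytes required, including the
 * terminating NUL. Returns 0 if the value was copied, or 1 if the buffer
 * was NULL or too small (nothing is written to `buffer` in that case). */
static int SketchGetString(const char* value, char* buffer, size_t* length) {
  size_t required = strlen(value) + 1;
  size_t capacity = *length;
  *length = required;
  if (buffer == NULL || capacity < required) return 1;
  memcpy(buffer, value, required);
  return 0;
}
```

A caller would typically call once with a NULL buffer to learn the required size, allocate, then call again. The library never allocates, so there is no question of who frees the string, at the cost of a two-call protocol.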

[GitHub] [arrow-adbc] lidavidm commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-10 Thread GitBox
lidavidm commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1211184131 So looking into it - rowcount is easy to bind, but hard to support (lots of things don't support it or only support it for inserts) - that's OK. Flight SQL only exposes it for

[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #17: Buffer element appenders

2022-08-10 Thread GitBox
paleolimbot commented on PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17#issuecomment-1211464135 Ok - this is a first pass at #8 that implements the functions needed to make "build by buffer" a thing. The `ArrowBufferAppendInt8()` family of functions helps make code that

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #17: Buffer element appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17#discussion_r943029428 ## src/nanoarrow/buffer_test.cc: ## @@ -160,3 +160,31 @@ TEST(BufferTest, BufferTestError) { ArrowBufferReset(); } + +TEST(BufferTest,

[GitHub] [arrow-nanoarrow] paleolimbot merged pull request #17: Buffer element appenders

2022-08-11 Thread GitBox
paleolimbot merged PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17

[GitHub] [arrow-adbc] zeroshade commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-11 Thread GitBox
zeroshade commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1212115628 Is the idea that `RowCount` would do double duty as the number of rows in a result set OR the number of rows affected by an update/insert? Given the lack of reliable support,

[GitHub] [arrow-adbc] lidavidm commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-11 Thread GitBox
lidavidm commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1212119635 > Is the idea that `RowCount` would do double duty as the number of rows in a result set OR the number of rows affected by an update/insert? Yeah, I don't see a reason to have

[GitHub] [arrow-adbc] zeroshade commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-11 Thread GitBox
zeroshade commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1212124555 Sounds good to me > Also, I would argue these sorts of use cases are mostly out of scope, though that's mostly my assumption. I agree, seems fine for that to be out of

[GitHub] [arrow-adbc] zeroshade commented on issue #64: [Format] Formalize thread safety guarantees

2022-08-14 Thread GitBox
zeroshade commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1214417469 I'll chime in from the Go database/sql package: * A Connection Object is assumed to be Stateful and will not be used concurrently by multiple Goroutines * Connections have

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-14 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1214424587 Right, and on the other hand, databases like SQLite have no reliable way to get the info. But APIs like JDBC, Python DBAPI, and Go's database API expose standard ways to get last

[GitHub] [arrow-adbc] krlmlr commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-14 Thread GitBox
krlmlr commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1214423568 Last inserted IDs (or the results of computed columns in general, for that matter) can be obtained with the `RETURNING` syntax for most databases, SQL Server has `OUTPUT` . This seems

[GitHub] [arrow-adbc] krlmlr commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-14 Thread GitBox
krlmlr commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1214402390 Just to be sure we're on the same page: - a "query" is a single SQL string that can return a result set but doesn't have to - a "statement" is the result of preparing an SQL

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety guarantees

2022-08-14 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1214418597 Thanks, I think that aligns with what I am basically assuming so far: no concurrent access (though maybe a particular driver can relax this, e.g. Flight SQL), but also no guarantees

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-14 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1214419966 > Just to be sure we're on the same page: > > * a "query" is a single SQL string that can return a result set but doesn't have to > > * a "statement" is the

[GitHub] [arrow-adbc] paleolimbot commented on issue #64: [Format] Formalize thread safety guarantees

2022-08-15 Thread GitBox
paleolimbot commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1214944399 > no guarantees on thread identity This is something I've noticed about calls to the `struct ArrowArrayStream` methods from both Arrow and DuckDB (sometimes they happen on

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215042875 @paleolimbot in that case, wouldn't it be the responsibility of the R/ADBC bridge to manage the threading? Requiring that all calls to ADBC happen from the same thread is rather

[GitHub] [arrow-adbc] pitrou commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
pitrou commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215072668 > @paleolimbot in that case, wouldn't it be the responsibility of the R/ADBC bridge to manage the threading? That sounds reasonable to me (if at all possible). I presume it is

[GitHub] [arrow-adbc] paleolimbot commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
paleolimbot commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215092186 > I presume it is possible to queue R function calls from another thread? It's an absolute nightmare to do safely without the consumer evaluating the function call in a

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215092467 So digging around, I would probably vote that for ADBC, we roughly follow the JDBC/ODBC concurrency/thread safety guarantees. Here, a statement is just a handle/object to

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215092911 > I just bring it up in case limiting method calls to the thread that created the connection is not in fact painful, and with the vague feeling that the sorry sod who has to work

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215148897 It's still worth thinking about, even if the answer is no :)

[GitHub] [arrow-adbc] zeroshade commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
zeroshade commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215277102 > An individual statement can be used multiple times, but result sets cannot be read concurrently (that is: executing a statement invalidates prior result sets) Are we

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215437455 Hmm, but that might impact something like SQLite or DuckDB - I'd have to test how DuckDB behaves here, though. In general, my understanding of most APIs is that prepared

[GitHub] [arrow-adbc] lidavidm opened a new pull request, #66: [Format] Clarify thread safety/concurrency guarantees

2022-08-15 Thread GitBox
lidavidm opened a new pull request, #66: URL: https://github.com/apache/arrow-adbc/pull/66 Add documentation around thread safety/concurrency/overlapping usage of a single object. Sets up Doxygen. Fixes #64.

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215574770 JDBC/ODBC and Flight SQL/Go are unfortunately at odds here, though. I'm not sure if there's a great way to express that Flight SQL lets you do this without penalty without also

[GitHub] [arrow-adbc] zeroshade commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
zeroshade commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215593890 @lidavidm I was writing that response and then stepped away for a meeting and came back to all your comments haha. But TL;DR: looks like under the hood the database/sql package in

[GitHub] [arrow-adbc] zeroshade commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
zeroshade commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215591703 > In general, my understanding of most APIs is that prepared statements aren't really set up to benefit concurrent execution, only repeated execution from a single logical chain of

[GitHub] [arrow-adbc] zeroshade commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
zeroshade commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215400906 > I am only talking about a single query, so this is sort of irrelevant to Go. For Go, I would expect each call to Query to initialize and use a new AdbcStatement, at which point

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215409446 Flight SQL would be able to do this easily, but libpq can't do this at all, for instance - that's the balance I'm trying to strike here. (Also even for Flight SQL, I'm not certain

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215587882 Possibly not, given that most APIs don't (or, they materialize up front). I'm also still a little skeptical whether Flight SQL truly allows this, or whether it's just not been

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215594902 I should've read the Go docs a little more closely :sweat_smile: in that case I'll keep the linked PR as-is, if it looks reasonable. Thanks for the help!

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215589955 Hmm. Go's docs state this for Stmt: > When the Stmt needs to execute on a new underlying connection, it will prepare itself on the new connection automatically.

[GitHub] [arrow-adbc] paleolimbot commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
paleolimbot commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215146633 No worries then - ignore me!

[GitHub] [arrow-adbc] lidavidm commented on pull request #66: [Format] Clarify thread safety/concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on PR #66: URL: https://github.com/apache/arrow-adbc/pull/66#issuecomment-1215532046 I guess DuckDB materializes the results fully when you query so that's not a worry (we could support Go's behavior)

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215562560 It will still create difficulties for drivers that wish to wrap JDBC/ODBC, which explicitly disallow this, though. Those will have to materialize all results in memory.

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215470298 Draft of some clarifying text in #66

[GitHub] [arrow-adbc] lidavidm commented on pull request #66: [Format] Clarify thread safety/concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on PR #66: URL: https://github.com/apache/arrow-adbc/pull/66#issuecomment-1215469871 TODO here: reconcile with Go's prepared statement behavior in #64 and see what DuckDB does in the described scenario

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215342377 > Are we referring to result sets from _different queries_ or only putting a limitation on result sets from a single prepared statement? So to be clear, "statement" here

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215542812 Long story short @zeroshade I'll tweak #66 to clarify that result sets are not invalidated except by closing the statement, but to caution that some drivers may have to materialize

[GitHub] [arrow-adbc] pitrou commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
pitrou commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215567684 Having to materialize all results instead of streaming them sounds like a rather major pitfall. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [arrow-adbc] pitrou commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
pitrou commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215580804 > I'm not sure if there's a great way to express that Flight SQL lets you do this [...] Is it important to express it? -- This is an automated message from the Apache Git

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-15 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1215630552 Ok, so how does this sound (here, I'm ignoring #59 provide a "just query" method): Remove `AdbcStatementGetStream` Change `AdbcStatementExecute` to

[GitHub] [arrow-nanoarrow] wesm opened a new issue, #20: Consolidate implementation into files, or implement an amalgamation step?

2022-08-15 Thread GitBox
wesm opened a new issue, #20: URL: https://github.com/apache/arrow-nanoarrow/issues/20 In the spirit of making vendoring simpler for users, I'm wondering what we can do to reduce the total number of files in the project. In very large projects that are intending to be vendored (like DuckDB

[GitHub] [arrow-nanoarrow] lidavidm opened a new issue, #21: Demonstrate vendoring + symbol renaming

2022-08-15 Thread GitBox
lidavidm opened a new issue, #21: URL: https://github.com/apache/arrow-nanoarrow/issues/21 Just to prove our claim of being easily embeddable, and to make sure that things work if multiple versions of nanoarrow end up in a binary (e.g. through two different dependencies) -- This is an
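As a rough illustration of the symbol renaming idea raised in this issue, one common approach is to route every public name through a namespace macro so that two vendored copies compiled with different prefixes cannot collide at link time. This is only a sketch: `EXAMPLE_NAMESPACE`, `EXAMPLE_SYMBOL`, and `ExampleAdd` are hypothetical names invented for this example and are not taken from nanoarrow.

```c
// Hypothetical prefix chosen by the project that vendors the library.
#define EXAMPLE_NAMESPACE MyVendoredCopy

// Two-level concatenation so that EXAMPLE_NAMESPACE is expanded before pasting.
#define EXAMPLE_CAT_(ns, name) ns##name
#define EXAMPLE_CAT(ns, name) EXAMPLE_CAT_(ns, name)
#define EXAMPLE_SYMBOL(name) EXAMPLE_CAT(EXAMPLE_NAMESPACE, name)

// Callers keep writing ExampleAdd; the linker sees MyVendoredCopyExampleAdd.
#define ExampleAdd EXAMPLE_SYMBOL(ExampleAdd)

// Stand-in for a library function (hypothetical, for illustration only).
int ExampleAdd(int a, int b) { return a + b; }
```

With this pattern the source-level name stays stable while the exported symbol carries the prefix, so a second copy built with a different `EXAMPLE_NAMESPACE` would export a different symbol.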

[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #65: [C] Basic libpq-based driver

2022-08-15 Thread GitBox
lidavidm commented on code in PR #65: URL: https://github.com/apache/arrow-adbc/pull/65#discussion_r946050839 ## c/drivers/postgres/statement.cc: ## @@ -0,0 +1,283 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See

[GitHub] [arrow-adbc] zeroshade commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-15 Thread GitBox
zeroshade commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1215736580 That seems reasonable to me @lidavidm -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [arrow-adbc] lidavidm commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
lidavidm commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215412769 I suppose we could keep the ArrowArrayStream as independent, and that would require `Execute` to block and accumulate all results in the case of libpq (which is also probably

[GitHub] [arrow-adbc] zeroshade commented on issue #64: [Format] Formalize thread safety and concurrency guarantees

2022-08-15 Thread GitBox
zeroshade commented on issue #64: URL: https://github.com/apache/arrow-adbc/issues/64#issuecomment-1215600926 @lidavidm Don't beat yourself up, it wasn't in the docs haha. I had to read the actual source code to figure that out :smile: -- This is an automated message from the Apache Git

[GitHub] [arrow-nanoarrow] lidavidm commented on issue #21: Demonstrate vendoring + symbol renaming

2022-08-15 Thread GitBox
lidavidm commented on issue #21: URL: https://github.com/apache/arrow-nanoarrow/issues/21#issuecomment-1216068768 jemalloc lets you do it as well (it's used in Arrow C++), though I'm not sure how it works there. (In that case, symbols are renamed, but still accessible.) I think this

[GitHub] [arrow-nanoarrow] wesm commented on issue #20: Consolidate implementation into files, or implement an amalgamation step?

2022-08-15 Thread GitBox
wesm commented on issue #20: URL: https://github.com/apache/arrow-nanoarrow/issues/20#issuecomment-1216074095 That seems odd that VSCode would struggle with large files (should this be reported as a bug?). It does not seem like a reason to compromise the design of the library. -- This

[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #20: Consolidate implementation into files, or implement an amalgamation step?

2022-08-15 Thread GitBox
paleolimbot commented on issue #20: URL: https://github.com/apache/arrow-nanoarrow/issues/20#issuecomment-1216058824 I think we could get it down to something like that...off the top of my head I can see 2 .h and 3 .c making sense (nanoarrow.h, nanoarrow_inline.h, array.c, schema.c,

[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #21: Demonstrate vendoring + symbol renaming

2022-08-15 Thread GitBox
paleolimbot commented on issue #21: URL: https://github.com/apache/arrow-nanoarrow/issues/21#issuecomment-1216064759 I was thinking that an `examples/` directory should exist...this should definitely be one of them! With respect to symbol renaming, I've only seen this in H3:

[GitHub] [arrow-nanoarrow] lidavidm commented on issue #18: Add element-wise getters?

2022-08-11 Thread GitBox
lidavidm commented on issue #18: URL: https://github.com/apache/arrow-nanoarrow/issues/18#issuecomment-1212365892 Hmm, wouldn't you just validate the lengths once in `ArrayViewInit`, and then the getters could just validate the given index against the logical length already in ArrowArray?
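The "validate once, then check only the index" idea in this comment could look roughly like the following. This is a minimal sketch under assumptions: `ExampleInt32View`, `ExampleInt32ViewInit`, and `ExampleInt32ViewGet` are hypothetical names made up for illustration, not nanoarrow functions, and only a plain int32 values buffer is handled.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical view over an int32 values buffer (not a nanoarrow type).
struct ExampleInt32View {
  const int32_t* values;
  int64_t length;  // logical length, validated once at init
};

// Check once that the buffer is large enough to hold `length` elements.
static inline int ExampleInt32ViewInit(struct ExampleInt32View* view,
                                       const int32_t* values,
                                       int64_t values_size_bytes,
                                       int64_t length) {
  if (length < 0 || values_size_bytes < 0) return EINVAL;
  if (length > values_size_bytes / (int64_t)sizeof(int32_t)) return EINVAL;
  if (length > 0 && values == NULL) return EINVAL;
  view->values = values;
  view->length = length;
  return 0;
}

// The getter only compares the index against the already-validated length.
static inline int ExampleInt32ViewGet(const struct ExampleInt32View* view,
                                      int64_t i, int32_t* out) {
  if (i < 0 || i >= view->length) return EINVAL;
  *out = view->values[i];
  return 0;
}
```

The design choice here is that the buffer-size check is paid once per array, while each element access costs only a bounds comparison.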

[GitHub] [arrow-julia] bdklahn opened a new issue, #330: Show Map example in documentation?

2022-08-11 Thread GitBox
bdklahn opened a new issue, #330: URL: https://github.com/apache/arrow-julia/issues/330 I'm so glad someone implemented Arrow for Julia. Thanks! And I think the intro to the User Manual is the clearest I've come across to help understand the what and why of Arrow. It looks

[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #18: Add element-wise getters?

2022-08-11 Thread GitBox
paleolimbot commented on issue #18: URL: https://github.com/apache/arrow-nanoarrow/issues/18#issuecomment-1212370668 I was separating the `ArrowLayout` bit into its own PR today anyway and will play with it a bit...maybe the `ArrowBufferView` is part of the `ArrayView` so that it's only

[GitHub] [arrow-nanoarrow] lidavidm opened a new issue, #18: Add element-wise getters?

2022-08-11 Thread GitBox
lidavidm opened a new issue, #18: URL: https://github.com/apache/arrow-nanoarrow/issues/18 This isn't strictly necessary - but I wonder what you'd think of having inline getters, things like ```c int ArrowArrayIsSet(struct ArrowArray* array, int64_t i) { // TODO: unions
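Since the snippet above is cut off, here is a hedged sketch of what an inline validity getter of this kind might look like. It assumes the Arrow columnar format's validity bitmap conventions (least-significant-bit numbering, a missing bitmap meaning "all valid") and, like the original TODO, ignores unions. `ExampleArray` is a hypothetical stand-in holding only the fields used here, and `ExampleArrayIsValid` is an invented name, not the function proposed in the issue.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical, trimmed-down stand-in for the C data interface array struct.
struct ExampleArray {
  int64_t length;
  int64_t offset;        // logical offset into the buffers
  const void** buffers;  // buffers[0] is the validity bitmap (may be NULL)
};

// Returns 1 if element i is valid (non-null), 0 otherwise.
// Assumes 0 <= i < array->length; unions are not handled.
static inline int ExampleArrayIsValid(const struct ExampleArray* array, int64_t i) {
  const uint8_t* validity = (const uint8_t*)array->buffers[0];
  if (validity == NULL) return 1;  // no bitmap: every element is valid
  int64_t bit = array->offset + i;
  return (validity[bit / 8] >> (bit % 8)) & 1;
}
```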

[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #18: Add element-wise getters?

2022-08-11 Thread GitBox
paleolimbot commented on issue #18: URL: https://github.com/apache/arrow-nanoarrow/issues/18#issuecomment-1212348074 I think that makes sense! Although it may need some extra information (storage type + layout) to be generic. Maybe: ```c struct ArrowArrayView(struct ArrowArray*

[GitHub] [arrow-nanoarrow] paleolimbot opened a new pull request, #19: ArrowArray consumer buffer helpers

2022-08-11 Thread GitBox
paleolimbot opened a new pull request, #19: URL: https://github.com/apache/arrow-nanoarrow/pull/19 Implements a few types whose goal is to provide access to buffers. This will require a `switch(storage_type)` by the consumer and requires some knowledge of the spec (getters to parallel the

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #19: ArrowArray consumer buffer helpers

2022-08-11 Thread GitBox
lidavidm commented on code in PR #19: URL: https://github.com/apache/arrow-nanoarrow/pull/19#discussion_r943929415 ## src/nanoarrow/utils_inline.h: ## @@ -26,6 +26,115 @@ extern "C" { #endif +static inline void ArrowLayoutInit(struct ArrowLayout* layout, Review Comment:
