postgresql): Timestamp write support

via GitHub Wed, 28 Jun 2023 10:49:01 -0700


lidavidm commented on code in PR #861:
URL: https://github.com/apache/arrow-adbc/pull/861#discussion_r1245565413



##########
c/driver/postgresql/statement.cc:
##########
@@ -605,6 +634,9 @@ AdbcStatusCode PostgresStatement::CreateBulkTable(
       case ArrowType::NANOARROW_TYPE_BINARY:
         create += " BYTEA";
         break;
+      case ArrowType::NANOARROW_TYPE_TIMESTAMP:

Review Comment:
   We should probably differentiate between WITH/WITHOUT TIMEZONE?
   
   note that in WITH TIMEZONE, Arrow always stores the underlying value in UTC 
so there's no need for us to do any time zone math (thankfully!)



##########
c/validation/adbc_validation_util.cc:
##########
@@ -141,7 +141,16 @@ int MakeSchema(struct ArrowSchema* schema, const 
std::vector<SchemaField>& field
   CHECK_ERRNO(ArrowSchemaSetTypeStruct(schema, fields.size()));
   size_t i = 0;
   for (const SchemaField& field : fields) {
-    CHECK_ERRNO(ArrowSchemaSetType(schema->children[i], field.type));
+    switch (field.type) {
+      case NANOARROW_TYPE_TIMESTAMP:
+        // TODO: don't hardcode unit here
+        CHECK_ERRNO(ArrowSchemaSetTypeDateTime(schema->children[i], field.type,
+                                               NANOARROW_TIME_UNIT_MICRO,

Review Comment:
   Ah, for here...you could also just write out the code to build the schema by 
hand? This was meant as a convenience to cut down on boilerplate for simple 
schemas. But it was from a very early nanoarrow version, and nowadays I'm not 
sure how much boilerplate it actually saves compared to just writing it out.



##########
c/driver/postgresql/statement.cc:
##########
@@ -337,6 +341,31 @@ struct BindStream {
               param_values[col] = const_cast<char*>(view.data.as_char);
               break;
             }
+            case ArrowType::NANOARROW_TYPE_TIMESTAMP: {
+              int64_t val = 
array_view->children[col]->buffer_views[1].data.as_int64[row];
+              auto unit = bind_schema_fields[col].time_unit;
+
+              // TODO: maybe upstream to nanoarrow as 
ArrowTimeUnitGetMultiplier
+              switch (unit) {
+                case NANOARROW_TIME_UNIT_SECOND:
+                  val *= 1000000;
+                  break;
+                case NANOARROW_TIME_UNIT_MILLI:
+                  val *= 1000;
+                  break;
+                case NANOARROW_TIME_UNIT_MICRO:
+                  break;
+                case NANOARROW_TIME_UNIT_NANO:
+                  val /= 1000;

Review Comment:
   We should handle truncation/overflow here



##########
c/driver/postgresql/statement.cc:
##########
@@ -216,6 +216,10 @@ struct BindStream {
           type_id = PostgresTypeId::kBytea;
           param_lengths[i] = 0;
           break;
+        case ArrowType::NANOARROW_TYPE_TIMESTAMP:
+          type_id = PostgresTypeId::kTimestamp;

Review Comment:
   We should differentiate between with/without timezone?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #861: feat(c/driver/postgresql): Timestamp write support

Reply via email to