jorisvandenbossche commented on code in PR #43537:
URL: https://github.com/apache/arrow/pull/43537#discussion_r1771296020
##########
python/pyarrow/_flight.pyx:
##########
@@ -704,7 +704,7 @@ cdef class FlightEndpoint(_Weakrefable):
cdef:
CFlightEndpoint endpoint
- def __init__(self, ticket, locations):
+ def __init__(self, ticket, locations, expiration_time=None,
app_metadata=""):
Review Comment:
```suggestion
def __init__(self, ticket, locations, TimestampScalar
expiration_time=None, app_metadata=""):
```
If we require such a scalar, I would either enforce that through cython in
the signature, or otherwise add a check in the code below that ensures a proper
values is passed. For example, right now if you pass a stdlib datetime, you get
"'datetime.datetime' object has no attribute 'cast'"
##########
python/pyarrow/_flight.pyx:
##########
@@ -713,6 +713,12 @@ cdef class FlightEndpoint(_Weakrefable):
the ticket needed to access this flight
locations : list of string URIs
locations where this flight is available
+ expiration_time : TimestampScalar optional, default None
Review Comment:
```suggestion
expiration_time : TimestampScalar, default None
```
Let's put either `, optional` or either `, default None`, but not both
##########
python/pyarrow/_flight.pyx:
##########
@@ -887,6 +933,23 @@ cdef class FlightInfo(_Weakrefable):
"""The size in bytes of the data in this flight, or -1 if unknown."""
return self.info.get().total_bytes()
+ @property
+ def ordered(self):
+ """Whether endpoints are in the same order as the data."""
+ return self.info.get().ordered()
+
+ @property
+ def app_metadata(self):
+ """
+ Application-defined opaque metadata.
+
+ There is no inherent or required relationship between this and the
app_metadata fields in the FlightEndpoints
+ or resulting FlightData messages. Since this metadata is
application-defined, a given application could define
+ there to be a relationship, but there is none required by the spec.
Review Comment:
Can you wrap this text in max 80 chars wide?
##########
python/pyarrow/_flight.pyx:
##########
@@ -736,16 +742,48 @@ cdef class FlightEndpoint(_Weakrefable):
CLocation.Parse(tobytes(location)).Value(&c_location))
self.endpoint.locations.push_back(c_location)
+ if expiration_time is not None:
+ self.endpoint.expiration_time =
TimePoint_to_system_time(TimePoint_from_ns(
+ expiration_time.cast(timestamp("ns")).value))
+
+ self.endpoint.app_metadata = tobytes(app_metadata)
+
@property
def ticket(self):
"""Get the ticket in this endpoint."""
return Ticket(self.endpoint.ticket.ticket)
@property
def locations(self):
+ """Get locations where this flight is available."""
return [Location.wrap(location)
for location in self.endpoint.locations]
+ @property
+ def expiration_time(self):
+ """Get the expiration time of this stream.
+
+ If present, clients may assume they can retry DoGet requests.
+ Otherwise, clients should avoid retrying DoGet requests.
+
+ """
+ cdef:
+ int64_t time_since_epoch
+ const char* UTC = "UTC"
+ shared_ptr[CTimestampType] time_type =
make_shared[CTimestampType](TimeUnit.TimeUnit_NANO, UTC)
+ shared_ptr[CTimestampScalar] shared
+ if self.endpoint.expiration_time.has_value():
+ time_since_epoch = TimePoint_to_ns(
+
TimePoint_from_system_time(self.endpoint.expiration_time.value()))
+ shared = make_shared[CTimestampScalar](time_since_epoch, time_type)
+ return Scalar.wrap(<shared_ptr[CScalar]> shared)
Review Comment:
```suggestion
return scalar(time_since_epoch, pa.timestamp("ns", "UTC"))
```
This is a bit simpler (I don't think the slight overhead is an issue for
this API). This also avoids you have to define the `time_type` above
##########
python/pyarrow/src/arrow/python/datetime.h:
##########
@@ -144,6 +144,16 @@ inline TimePoint TimePoint_from_ns(int64_t val) {
return TimePoint(TimePoint::duration(val));
}
+ARROW_PYTHON_EXPORT
+inline std::chrono::system_clock::time_point
TimePoint_to_system_time(TimePoint val) {
+ return
std::chrono::time_point_cast<std::chrono::system_clock::duration>(val);
Review Comment:
Just to understand this code: what does this exactly do? Because my reading
is that `TimePoint` type is also already an alias for some `time_point` ?
##########
python/pyarrow/includes/libarrow.pxd:
##########
@@ -277,6 +277,7 @@ cdef extern from "arrow/api.h" namespace "arrow" nogil:
cdef cppclass CTimestampType" arrow::TimestampType"(CFixedWidthType):
CTimestampType(TimeUnit unit)
+ CTimestampType(TimeUnit unit, const c_string& timezone)
Review Comment:
If you take my above suggestion, this is not needed anymore I think
##########
python/pyarrow/src/arrow/python/datetime.h:
##########
@@ -144,6 +144,16 @@ inline TimePoint TimePoint_from_ns(int64_t val) {
return TimePoint(TimePoint::duration(val));
}
+ARROW_PYTHON_EXPORT
+inline std::chrono::system_clock::time_point
TimePoint_to_system_time(TimePoint val) {
+ return
std::chrono::time_point_cast<std::chrono::system_clock::duration>(val);
Review Comment:
I see a comment in the tests about `system_clock::duration` being different
depending on the platform, while `TimePoint` here is defined to always be
nanoseconds?
So for the flight API we need to system-dependent resolution?
If that's the reason for adding those APIs, could you add a comment briefly
explaining what those functions do and why we need them?
##########
python/pyarrow/_flight.pyx:
##########
@@ -855,10 +895,14 @@ cdef class FlightInfo(_Weakrefable):
the descriptor for this flight.
endpoints : list of FlightEndpoint
a list of endpoints where this flight is available.
- total_records : int
- the total records in this flight, or -1 if unknown
- total_bytes : int
- the total bytes in this flight, or -1 if unknown
+ total_records : int optional, default None
+ the total records in this flight, -1 or None if unknown.
+ total_bytes : int optional, default None
+ the total bytes in this flight, -1 or None if unknown.
+ ordered : boolean optional, default False
Review Comment:
```suggestion
ordered : boolean, default False
```
##########
python/pyarrow/_flight.pyx:
##########
@@ -713,6 +713,12 @@ cdef class FlightEndpoint(_Weakrefable):
the ticket needed to access this flight
locations : list of string URIs
locations where this flight is available
+ expiration_time : TimestampScalar optional, default None
+ Expiration time of this stream. If present, clients may assume
+ they can retry DoGet requests. Otherwise, clients should avoid
+ retrying DoGet requests.
+ app_metadata : bytes or str optional, default ""
Review Comment:
```suggestion
app_metadata : bytes or str, default ""
```
(to follow the same change as you did further down below)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]