JackieTien97 commented on code in PR #816:
URL: https://github.com/apache/tsfile/pull/816#discussion_r3251951044


##########
python/tsfile/dataset/reader.py:
##########
@@ -403,6 +526,60 @@ def read_series_by_row(
             return timestamp_parts[0], value_parts[0]
         return np.concatenate(timestamp_parts), np.concatenate(value_parts)
 
+    def _read_series_by_row_tree(
+        self, device_path: str, field_name: str, offset: int, limit: int
+    ) -> Tuple[np.ndarray, np.ndarray]:
+        """Tree-model row read: scan on-tree result, filter device, apply 
offset/limit."""
+        target_path_segments = device_path.split(".")
+        # +1 because cwrapper prepends the root as an extra col_i cell.
+        expected_path_len = max(
+            len(t.tag_columns) for t in self._catalog.table_entries
+        ) + 1
+        timestamps = []
+        values = []
+        skipped = 0
+        with self._reader.query_table_on_tree([field_name]) as result_set:

Review Comment:
   Both `_read_series_by_row_tree` and `_read_arrow_tree` do a full 
`query_table_on_tree` scan over the entire file and then client-side filter by 
`target_path_segments`. For a file with many devices, this means reading every 
row to extract data for a single device.
   
   This is acknowledged in the PR description as a cwrapper limitation 
workaround, which is fine for now — but worth flagging for future readers: this 
is **O(total_rows)** per device, so an aligned read across N devices in the 
same file becomes **O(N × total_rows)**. The comments at the call-site document 
the "why" well; a brief `# O(total_rows) — see PR #816 for cwrapper 
limitations` here would also help future profilers find the hot path.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to