[ 
https://issues.apache.org/jira/browse/BEAM-8376?focusedWorklogId=621719&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-621719
 ]

ASF GitHub Bot logged work on BEAM-8376:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Jul/21 20:31
            Start Date: 12/Jul/21 20:31
    Worklog Time Spent: 10m 
      Work Description: BenWhitehead commented on a change in pull request 
#15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r668234834



##########
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, because it has the
+ * ability to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>You should only ever use ListDocuments if the use of <a target="_blank" rel="noopener
+ * noreferrer"
+ * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">{@code
+ * show_missing}</a> is needed to access a document. RunQuery and PartitionQuery will always be
+ * faster if the use of {@code show_missing} is not needed.
+ *
+ * <p><b>Example Usage</b>
+ *
+ * <pre>{@code
+ * PCollection<PartitionQueryRequest> partitionQueryRequests = ...;
+ * PCollection<RunQueryResponse> partitionQueryResponses = partitionQueryRequests
+ *     .apply(FirestoreIO.v1().read().partitionQuery().build());
+ * }</pre>
+ *
+ * <pre>{@code
+ * PCollection<RunQueryRequest> runQueryRequests = ...;

Review comment:
       Each of the `PTransform`s off of `FirestoreIO.v1().read()` represents an
individual RPC that Firestore supports for accessing data. Each one has at
least one feature that differentiates it from the other, similar methods, which
justifies its presence.
   
   1. BatchGet is currently the only way to get documents by their ID. Some
customers do external ID management, which is then coordinated across several
systems.
   2. ListCollectionIds is currently the only way to enumerate the collections
of a document.
   3. ListDocuments is currently the only way to access documents that have
subcollections but no properties themselves (via `show_missing`).
   4. RunQuery is the primary and most performant way of fetching documents by
some criteria.
   5. PartitionQuery works in conjunction with RunQuery. Today only
CollectionGroup queries are supported for partitioning, but more query types
are intended to be supported in the future.
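
The selection rules above, together with the execution behaviors from the Javadoc table in the diff, can be summarized in a small sketch. This is not part of the PR or of Beam: `FirestoreReadRpcGuide`, `Need`, `Rpc`, `Execution`, and `choose` are hypothetical names introduced purely for illustration, and the mapping only restates the five points listed in this comment.

```java
/**
 * Hypothetical helper (not part of Beam or this PR) that restates the
 * review comment: which Firestore read RPC fits which access need.
 */
final class FirestoreReadRpcGuide {

  /** Execution behaviors, as listed in the Javadoc table in the diff. */
  enum Execution { PARALLEL_STREAMING, SEQUENTIAL_STREAMING, SEQUENTIAL_PAGINATED }

  /** Read RPCs and their execution behavior, as listed in the Javadoc table. */
  enum Rpc {
    PARTITION_QUERY(Execution.PARALLEL_STREAMING),
    RUN_QUERY(Execution.SEQUENTIAL_STREAMING),
    BATCH_GET(Execution.SEQUENTIAL_STREAMING),
    LIST_COLLECTION_IDS(Execution.SEQUENTIAL_PAGINATED),
    LIST_DOCUMENTS(Execution.SEQUENTIAL_PAGINATED);

    final Execution execution;

    Rpc(Execution execution) {
      this.execution = execution;
    }
  }

  /** Assumed categories of caller needs, one per point in the review comment. */
  enum Need {
    DOCUMENTS_BY_ID,            // point 1
    COLLECTIONS_OF_A_DOCUMENT,  // point 2
    DOCUMENTS_WITHOUT_FIELDS,   // point 3: requires show_missing
    QUERY_BY_CRITERIA,          // point 4
    COLLECTION_GROUP_QUERY      // point 5: only query type that can be partitioned today
  }

  /** Returns the RPC that the review comment associates with the given need. */
  static Rpc choose(Need need) {
    switch (need) {
      case DOCUMENTS_BY_ID:
        return Rpc.BATCH_GET;
      case COLLECTIONS_OF_A_DOCUMENT:
        return Rpc.LIST_COLLECTION_IDS;
      case DOCUMENTS_WITHOUT_FIELDS:
        return Rpc.LIST_DOCUMENTS;
      case QUERY_BY_CRITERIA:
        return Rpc.RUN_QUERY;
      case COLLECTION_GROUP_QUERY:
        return Rpc.PARTITION_QUERY;
      default:
        throw new IllegalArgumentException("Unknown need: " + need);
    }
  }

  public static void main(String[] args) {
    // Print the RPC and its execution behavior for every need.
    for (Need need : Need.values()) {
      Rpc rpc = choose(need);
      System.out.println(need + " -> " + rpc + " (" + rpc.execution + ")");
    }
  }
}
```

The sketch only encodes the guidance as stated here; in a real pipeline the choice would be made by picking the corresponding builder off of `FirestoreIO.v1().read()`.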





Issue Time Tracking
-------------------

    Worklog Id:     (was: 621719)
    Time Spent: 39h  (was: 38h 50m)

> Add FirestoreIO connector to Java SDK
> -------------------------------------
>
>                 Key: BEAM-8376
>                 URL: https://issues.apache.org/jira/browse/BEAM-8376
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-java-gcp
>            Reporter: Stefan Djelekar
>            Priority: P3
>          Time Spent: 39h
>  Remaining Estimate: 0h
>
> Motivation:
> There is no Firestore connector for Java SDK at the moment.
> Having it will enhance the integrations with database options on the Google 
> Cloud Platform.


