[
https://issues.apache.org/jira/browse/BEAM-3973?focusedWorklogId=87565&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87565
]
ASF GitHub Bot logged work on BEAM-3973:
----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Apr/18 14:33
Start Date: 04/Apr/18 14:33
Worklog Time Spent: 10m
Work Description: iemejia commented on a change in pull request #4946:
[BEAM-3973] Adds a parameter to the Cloud Spanner read connector that can
disable batch API
URL: https://github.com/apache/beam/pull/4946#discussion_r179146877
##########
File path:
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerReadIT.java
##########
@@ -193,6 +196,52 @@ public void testQuery() throws Exception {
p.run();
}
+ @Test
+ public void testReadAll() throws Exception {
+ DatabaseClient databaseClient =
+ spanner.getDatabaseClient(
+ DatabaseId.of(
+ project, options.getInstanceId(), databaseName));
+
+ List<Mutation> mutations = new ArrayList<>();
+ for (int i = 0; i < 5L; i++) {
+ mutations.add(
+ Mutation.newInsertOrUpdateBuilder(options.getTable())
+ .set("key")
+ .to((long) i)
+ .set("value")
+ .to(RandomUtils.randomAlphaNumeric(100))
+ .build());
+ }
+
+ databaseClient.writeAtLeastOnce(mutations);
+
+ SpannerConfig spannerConfig = SpannerConfig.create()
+ .withProjectId(project)
+ .withInstanceId(options.getInstanceId())
+ .withDatabaseId(databaseName);
+
+ PCollectionView<Transaction> tx =
+ p.apply(
+ SpannerIO.createTransaction()
+ .withSpannerConfig(spannerConfig)
+ .withTimestampBound(TimestampBound.strong()));
+
+ PCollection<Struct> allRecords = p.apply(SpannerIO.read()
+ .withSpannerConfig(spannerConfig)
+ .withBatching(false)
Review comment:
Is there a way to detect that a user is running a non-root-partitionable query
without setting the batching flag correctly? I wonder if it is worth creating a
test for this error case, and if we can detect it early via some call in the
API, maybe we should add that check to expand. (I saw a TODO there, but I am
not sure if it is for the same goal.)
Issue Time Tracking
-------------------
Worklog Id: (was: 87565)
> Allow to disable batch API in SpannerIO
> ---------------------------------------
>
> Key: BEAM-3973
> URL: https://issues.apache.org/jira/browse/BEAM-3973
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.4.0
> Reporter: Mairbek Khadikov
> Assignee: Mairbek Khadikov
> Priority: Major
> Fix For: 2.5.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> In 2.4.0, SpannerIO#read was migrated to use the batch API. The batch API
> provides abstractions to scale out reads from Spanner, but it requires the
> query to be root-partitionable. Root-partitionable queries cover the majority
> of use cases; however, there are cases where running an arbitrary query is
> useful. For example, reading all the table names from
> information_schema.* and then reading the contents of those tables in the
> next step.
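
The two-step read described above could be sketched roughly as follows. This is
only an illustration, not code from the pull request: the class
TableReadQueries and its method buildReadQueries are hypothetical names, and
the sketch only builds the query strings. In a pipeline, each query would
presumably be handed to SpannerIO.read() with the withBatching(false) option
shown in the diff.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Beam or the Spanner client. */
final class TableReadQueries {

  // Step 1: list user tables from the information schema. Per the issue
  // description, this is the kind of query that would run with batching disabled.
  static final String LIST_TABLES_QUERY =
      "SELECT t.table_name FROM information_schema.tables AS t "
          + "WHERE t.table_catalog = '' AND t.table_schema = ''";

  // Step 2: turn the table names returned by step 1 into one read query per table.
  static List<String> buildReadQueries(List<String> tableNames) {
    List<String> queries = new ArrayList<>();
    for (String name : tableNames) {
      // Reject anything that does not look like a plain identifier, so that a
      // name is never spliced into SQL unchecked.
      if (!name.matches("[A-Za-z][A-Za-z0-9_]*")) {
        throw new IllegalArgumentException("Unexpected table name: " + name);
      }
      queries.add("SELECT * FROM `" + name + "`");
    }
    return queries;
  }
}
```

Validating the names before building the queries is a design choice of this
sketch, since the names come from query results rather than from constants.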