lastomato commented on a change in pull request #12721:
URL: https://github.com/apache/beam/pull/12721#discussion_r482673857



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/FhirIO.java
##########
@@ -122,12 +124,24 @@
  * store) This requires each resource to contain a client provided ID. It is important that when
  * using import you give the appropriate permissions to the Google Cloud Healthcare Service Agent.
  *
+ * <p>ExportGcs This is to export FHIR resources from a FHIR store to Google Cloud Storage. The

Review comment:
       I am a bit confused. The purpose of exporting is to load the FHIR
resources into Dataflow for further processing, so how the data is loaded
(e.g. via GCS or BQ) should not matter. If users would like to apply
transformations before sending the data to BQ, they can still export via GCS
and then use BigQueryIO to write the data. Did I miss anything?

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/FhirIO.java
##########
@@ -170,6 +184,21 @@
  * // Alternatively you could use import for high throughput to a new store.
  * FhirIO.Write.Result writeResult =
  *     output.apply("Import FHIR Resources", FhirIO.executeBundles(options.getNewFhirStore()));
+ *
+ * // Export FHIR resources to Google Cloud Storage.
+ * String fhirStoreName = ...;
+ * String exportGcsUriPrefix = ...;
+ * FhirIO.ExportGcs.Result exportResult =

Review comment:
       In that case, you probably won't need to wrap it in a `Result` class, 
since you can return a `PCollection` directly.
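
       To illustrate the point about dropping the wrapper, here is a minimal
sketch in plain Java rather than Beam. It assumes `List<String>` as a stand-in
for `PCollection<String>`, and every class and method name in it is
hypothetical, not part of `FhirIO`. It only shows that a wrapper holding a
single output adds an unwrapping step without adding information.

```java
import java.util.Collections;
import java.util.List;

// Stand-in sketch only: List<String> plays the role of PCollection<String>,
// and every name here is hypothetical rather than part of FhirIO.
final class DirectOutputSketch {

  /** Wrapper with a single field, analogous to a Result class holding one output. */
  static final class ExportResult {
    private final List<String> resources;

    ExportResult(List<String> resources) {
      this.resources = resources;
    }

    List<String> getResources() {
      return resources;
    }
  }

  /** Hypothetical export step that produces one placeholder record per store. */
  private static List<String> runExport(String fhirStore) {
    return Collections.singletonList("resource-from-" + fhirStore);
  }

  /** Variant 1: the caller has to unwrap the result before chaining further steps. */
  static ExportResult exportWrapped(String fhirStore) {
    return new ExportResult(runExport(fhirStore));
  }

  /** Variant 2: the output is returned directly and can be chained immediately. */
  static List<String> exportDirect(String fhirStore) {
    return runExport(fhirStore);
  }

  public static void main(String[] args) {
    List<String> viaWrapper = exportWrapped("store-a").getResources();
    List<String> direct = exportDirect("store-a");
    // Both variants carry the same data; only the call shape differs.
    System.out.println(viaWrapper.equals(direct));
  }
}
```

       A wrapper would still be worth keeping if the transform later needs to
expose more than one output, for example a separate collection of failures.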

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/FhirIO.java
##########
@@ -1297,4 +1362,67 @@ public void exportResourcesToGcs(ProcessContext context)
       }
     }
   }
+
+  /** Deidentify FHIR resources from a FHIR store to a destination FHIR store */
+  public static class Deidentify extends PTransform<PBegin, PCollection<String>> {
+    private final ValueProvider<String> sourceFhirStore;
+    private final ValueProvider<String> destinationFhirStore;
+    private final ValueProvider<DeidentifyConfig> deidConfig;
+
+    public Deidentify(
+        ValueProvider<String> sourceFhirStore,
+        ValueProvider<String> destinationFhirStore,
+        ValueProvider<DeidentifyConfig> deidConfig) {
+      this.sourceFhirStore = sourceFhirStore;
+      this.destinationFhirStore = destinationFhirStore;
+      this.deidConfig = deidConfig;
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      return input
+          .getPipeline()
+          .apply(Create.ofProvider(sourceFhirStore, StringUtf8Coder.of()))
+          .apply(
+              "ScheduleDeidentifyFhirStoreOperations",
+              ParDo.of(new DeidentifyFn(destinationFhirStore, deidConfig)));
+    }
+
+    /** A function that schedules a deidentify operation and monitors the status. */
+    public static class DeidentifyFn extends DoFn<String, String> {
+
+      private HealthcareApiClient client;
+      private final ValueProvider<String> destinationFhirStore;
+      private final String deidConfigJson;
+
+      public DeidentifyFn(
+          ValueProvider<String> destinationFhirStore, ValueProvider<DeidentifyConfig> deidConfig) {
+        this.destinationFhirStore = destinationFhirStore;
+        Gson g = new Gson();
+        this.deidConfigJson = g.toJson(deidConfig.get());

Review comment:
       I see, that makes sense.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

