[
https://issues.apache.org/jira/browse/BEAM-12400?focusedWorklogId=612121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612121
]
ASF GitHub Bot logged work on BEAM-12400:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 18/Jun/21 21:22
Start Date: 18/Jun/21 21:22
Worklog Time Spent: 10m
Work Description: pabloem commented on a change in pull request #14927:
URL: https://github.com/apache/beam/pull/14927#discussion_r653794490
##########
File path:
sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
##########
@@ -106,6 +109,28 @@
* .withNumSplits(30))
*
* }</pre>
+ *
+ * *
+ *
+ * <p>To configure a MongoDB sink and update, you must specify a connection
{@code URI}, a {@code
+ * Database} * name, a {@code Collection} name. It matches the key with _id in
target collection.
+ * For instance: * *
+ *
+ * <pre>{@code
+ * * pipeline
+ * * .apply(...)
+ * * .apply(MongoDbIO.write()
+ * * .withUri("mongodb://localhost:27017")
+ * * .withDatabase("my-database")
+ * * .withCollection("my-collection")
+ * * .withIsUpdate(true)
+ * * .withUpdateKey("key-to-match")
+ * * .withUpdateField("field-to-update")
+ * * .withUpdateOperator("$set")
+ * * .withNumSplits(30))
Review comment:
My guess is that maybe we have a document like this in the database:
```
{
  "name": "ironman",
  "status": "active",
  "age": 55,
  "location": "malibu",
  "lastUpdated": "(SOME TIMESTAMP)"
}
```
And perhaps the PCollection that comes as input contains the following
element:
```
{
  "name": "ironman",
  "status": "inactive",
  "age": 56,
  "location": "malibu"
}
```
In this case, we could do:
```
.withUpdateKey("name")
.withUpdateFields(
    UpdateField.of("$set", "status"),
    UpdateField.of("$currentDate", "lastUpdated"),
    UpdateField.of("$set", "age")))
```
Does that make sense? Another thing I wonder about: what if we just want
to update all the fields without having to list them one by one in the
`updateFields` attribute of the transform?
Thoughts?
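To make the idea concrete, here is a rough sketch of the filter and update document that the `withUpdateKey` / `withUpdateFields` settings above might translate to for the example element. Everything in it is an assumption for illustration: `UpdateField`, `buildFilter`, and `buildUpdate` are hypothetical names, not existing MongoDbIO API, and plain maps stand in for BSON documents.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UpdateSpecSketch {

  /** Hypothetical stand-in for the UpdateField suggested in this comment. */
  public static final class UpdateField {
    final String operator;
    final String field;

    private UpdateField(String operator, String field) {
      this.operator = operator;
      this.field = field;
    }

    public static UpdateField of(String operator, String field) {
      return new UpdateField(operator, field);
    }
  }

  /** Builds the filter that matches the target document on the update key. */
  public static Map<String, Object> buildFilter(String updateKey, Map<String, Object> element) {
    Map<String, Object> filter = new LinkedHashMap<>();
    filter.put(updateKey, element.get(updateKey));
    return filter;
  }

  /** Groups the listed fields by operator into a single update document. */
  public static Map<String, Map<String, Object>> buildUpdate(
      List<UpdateField> fields, Map<String, Object> element) {
    Map<String, Map<String, Object>> update = new LinkedHashMap<>();
    for (UpdateField f : fields) {
      // $currentDate takes `true` instead of a value read from the element.
      Object value = "$currentDate".equals(f.operator) ? Boolean.TRUE : element.get(f.field);
      update.computeIfAbsent(f.operator, k -> new LinkedHashMap<>()).put(f.field, value);
    }
    return update;
  }

  public static void main(String[] args) {
    // The input element from the example above.
    Map<String, Object> element = new LinkedHashMap<>();
    element.put("name", "ironman");
    element.put("status", "inactive");
    element.put("age", 56);
    element.put("location", "malibu");

    List<UpdateField> fields = Arrays.asList(
        UpdateField.of("$set", "status"),
        UpdateField.of("$currentDate", "lastUpdated"),
        UpdateField.of("$set", "age"));

    // prints {name=ironman}
    System.out.println(buildFilter("name", element));
    // prints {$set={status=inactive, age=56}, $currentDate={lastUpdated=true}}
    System.out.println(buildUpdate(fields, element));
  }
}
```

With this shape, `location` is left untouched because it is not listed, which is exactly why the "update all fields" case would need either a separate option or a default that puts every non-key field under `$set`.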
Issue Time Tracking
-------------------
Worklog Id: (was: 612121)
Remaining Estimate: 167h (was: 167h 10m)
Time Spent: 1h (was: 50m)
> Improve MongoDBIO for beam - add update capability
> --------------------------------------------------
>
> Key: BEAM-12400
> URL: https://issues.apache.org/jira/browse/BEAM-12400
> Project: Beam
> Issue Type: Improvement
> Components: io-java-mongodb
> Reporter: Paresh Saraf
> Assignee: Paresh Saraf
> Priority: P2
> Original Estimate: 168h
> Time Spent: 1h
> Remaining Estimate: 167h
>
> Right now MongoDbIO supports only inserts/overwrites to a collection. In
> many cases the need is to update an existing document: setting a field
> or pushing into an array. Bulk update capability should be added as part of
> MongoDbIO -> Write.