Hi James,

I have started exploring the option of building the scan processor for DynamoDB. I will let you know how it goes.

Thanks,
Rai

On Fri, Oct 14, 2016 at 11:42 AM, James Wing <[email protected]> wrote:

> Correct, but I'm afraid I'm no expert on DynamoDB. It is my understanding
> that you have to iterate through the keys in the source table one-by-one,
> then put each key's content into the destination table. You can speed this
> up by using multiple iterators, each covering a distinct portion of the key
> range.
>
> Amazon does provide tools as part of AWS Data Pipeline that might help
> automate this, and if all you want is an identical export and import, that
> is probably easier than NiFi. But I believe the underlying process is very
> similar, just that Amazon using an ElasticMapReduce cluster instead of
> NiFi. A key point being that the export and import operations count
> against your provisioned throughput, Amazon provides no shortcut around
> paying for the I/O. But this might work now, today, without any custom
> code.
>
> Cross-Region Export and Import of DynamoDB Tables
> https://aws.amazon.com/blogs/aws/cross-region-import-and-export-of-dynamodb-tables/
>
> AWS Data Pipeline - Export DynamoDB Table to S3
> http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-exportddbtos3.html
>
> AWS Data Pipeline - Import DynamoDB Backup Data from S3
> http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-exports3toddb.html
>
> Thanks,
>
> James
>
> On Fri, Oct 14, 2016 at 10:58 AM, Gop Krr <[email protected]> wrote:
>
>> Thanks James. I would be happy to contribute the scan processor for
>> DynamoDB. Just to clarify, based on your comment, we can't take all the
>> rows of the DynamoDB table and put it into another table. We have to do it
>> for one record at a time?
>>
>> On Fri, Oct 14, 2016 at 10:50 AM, James Wing <[email protected]> wrote:
>>
>>> NiFi's GetDynamoDB processor uses the underlying BatchGetItem API, which
>>> requires item keys as inputs. Iterating over the keys in a table would
>>> require the Scan API, but NiFi does not have a processor to scan a DynamoDB
>>> table.
>>>
>>> This would be a great addition to NiFi. If you have any interest in
>>> working on a scan processor, please open a JIRA ticket at
>>> https://issues.apache.org/jira/browse/NIFI.
>>>
>>> Thanks,
>>>
>>> James
>>>
>>> On Thu, Oct 13, 2016 at 2:12 PM, Gop Krr <[email protected]> wrote:
>>>
>>>> Thanks James. I am looking to iterate through the table so that it
>>>> takes hash key values one by one. Do I achieve it through the expression
>>>> language? if I write an script to do that, how do I pass it to my
>>>> processor?
>>>> Thanks
>>>> Niraj
>>>>
>>>> On Thu, Oct 13, 2016 at 1:42 PM, James Wing <[email protected]> wrote:
>>>>
>>>>> Rai,
>>>>>
>>>>> The GetDynamoDB processor requires a hash key value to look up an item
>>>>> in the table. The default setting is an Expression Language statement
>>>>> that reads the hash key value from a flowfile attribute,
>>>>> dynamodb.item.hash.key.value. But this is not required. You can change
>>>>> it to any attribute expression ${my.hash.key}, or even hard-code a
>>>>> single key "item123" if you wish.
>>>>>
>>>>> Does that help?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> James
>>>>>
>>>>> On Thu, Oct 13, 2016 at 12:17 PM, Gop Krr <[email protected]> wrote:
>>>>>
>>>>>> Hi All,
>>>>>> I have been trying to use get and load processor for the dynamodb and
>>>>>> I am almost there. I am able to run the get processor and I see, data is
>>>>>> flowing :)
>>>>>> But I see the following error in my nifi-app.log file:
>>>>>>
>>>>>> 2016-10-13 18:02:38,823 ERROR [Timer-Driven Process Thread-9]
>>>>>> o.a.n.p.aws.dynamodb.GetDynamoDB
>>>>>> GetDynamoDB[id=7d906337-0157-1000-5868-479d0e0e3580]
>>>>>> Hash key value '' is required for flow file
>>>>>> StandardFlowFileRecord[uuid=44554c23-1618-47db-b46e-04ffd737748e,
>>>>>> claim=StandardContentClaim
>>>>>> [resourceClaim=StandardResourceClaim[id=1476381755460-37287,
>>>>>> container=default, section=423], offset=0, length=1048576],offset=0,
>>>>>> name=2503473718684086,size=1048576]
>>>>>>
>>>>>> I understand that, its looking for the Hash Key Value but I am not
>>>>>> sure, how do I pass it. In the setting tab, nifi automatically populates
>>>>>> this: ${dynamodb.item.hash.key.value} but looks like this is not the
>>>>>> right way to do it. Can I get some guidance on this? Thanks for all the
>>>>>> help.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Rai
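
For reference, the copy approach discussed in this thread (iterate over the source table with the Scan API, optionally splitting the work across several iterators, and put each item into the destination table) might be sketched roughly as follows. This is only an illustration outside NiFi, not the proposed processor: it assumes `client` is a boto3 DynamoDB client (or any object exposing `scan` and `put_item` with the same keyword arguments), and the function names and table names are hypothetical.

```python
def scan_segment(client, table_name, segment=0, total_segments=1):
    """Yield every item in one segment of a table, following pagination.

    `client` is assumed to behave like a boto3 DynamoDB client.
    """
    kwargs = {"TableName": table_name}
    if total_segments > 1:
        # Parallel scan: each worker reads a distinct portion of the table.
        kwargs["Segment"] = segment
        kwargs["TotalSegments"] = total_segments
    while True:
        page = client.scan(**kwargs)
        for item in page.get("Items", []):
            yield item
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages in this segment
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def copy_segment(client, source_table, dest_table, segment=0, total_segments=1):
    """Copy one segment item by item; returns the number of items written."""
    count = 0
    for item in scan_segment(client, source_table, segment, total_segments):
        client.put_item(TableName=dest_table, Item=item)
        count += 1
    return count
```

With a real client, something like `copy_segment(client, "source-table", "dest-table", segment=i, total_segments=4)` could be run once per worker for i = 0..3. Both the scan and the puts count against provisioned throughput, as noted above.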
