Hi James,
I have started exploring the option of building a scan processor for
DynamoDB.
I will let you know how it goes.
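
For reference, the parallel scan James describes below might be sketched
roughly like this in Python. This is only a sketch under assumptions, not a
NiFi processor: `client` is assumed to be a boto3 DynamoDB client (for
example `boto3.client("dynamodb")`), the helper names `scan_segment` and
`copy_table` are made up for illustration, and retries and throttling are
left out.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(client, table, segment, total_segments):
    """Yield every item in one segment of the table, following pagination."""
    kwargs = {
        "TableName": table,
        "Segment": segment,                # which slice of the key range to read
        "TotalSegments": total_segments,   # how many slices the table is split into
    }
    while True:
        page = client.scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return  # no more pages in this segment
        kwargs["ExclusiveStartKey"] = last_key


def copy_table(client, source, destination, total_segments=4):
    """Copy items one at a time, using one worker thread per scan segment."""

    def copy_one(segment):
        count = 0
        for item in scan_segment(client, source, segment, total_segments):
            client.put_item(TableName=destination, Item=item)
            count += 1
        return count

    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        return sum(pool.map(copy_one, range(total_segments)))
```

Calling `copy_table(client, "source-table", "destination-table")` (both
table names hypothetical) returns the number of items copied. Every scan
and put still counts against provisioned throughput on both tables.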
Thanks
Rai
On Fri, Oct 14, 2016 at 11:42 AM, James Wing <jvw...@gmail.com> wrote:
> Correct, but I'm afraid I'm no expert on DynamoDB. It is my understanding
> that you have to iterate through the keys in the source table one-by-one,
> then put each key's content into the destination table. You can speed this
> up by using multiple iterators, each covering a distinct portion of the key
> range.
>
> Amazon does provide tools as part of AWS Data Pipeline that might help
> automate this, and if all you want is an identical export and import, that
> is probably easier than NiFi. But I believe the underlying process is very
> similar, except that Amazon uses an Elastic MapReduce cluster instead of
> NiFi. A key point is that the export and import operations count
> against your provisioned throughput; Amazon provides no shortcut around
> paying for the I/O. But this might work now, today, without any custom
> code.
>
> Cross-Region Export and Import of DynamoDB Tables
> https://aws.amazon.com/blogs/aws/cross-region-import-and-export-of-dynamodb-tables/
>
> AWS Data Pipeline - Export DynamoDB Table to S3
> http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-exportddbtos3.html
>
> AWS Data Pipeline - Import DynamoDB Backup Data from S3
> http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-exports3toddb.html
>
> Thanks,
>
> James
>
>
> On Fri, Oct 14, 2016 at 10:58 AM, Gop Krr <gop@gmail.com> wrote:
>
>> Thanks James. I would be happy to contribute the scan processor for
>> DynamoDB. Just to clarify, based on your comment, we can't take all the
>> rows of the DynamoDB table and put them into another table? We have to do
>> it one record at a time?
>>
>> On Fri, Oct 14, 2016 at 10:50 AM, James Wing <jvw...@gmail.com> wrote:
>>
>>> NiFi's GetDynamoDB processor uses the underlying BatchGetItem API, which
>>> requires item keys as inputs. Iterating over the keys in a table would
>>> require the Scan API, but NiFi does not have a processor to scan a DynamoDB
>>> table.
>>>
>>> This would be a great addition to NiFi. If you have any interest in
>>> working on a scan processor, please open a JIRA ticket at
>>> https://issues.apache.org/jira/browse/NIFI.
>>>
>>> Thanks,
>>>
>>> James
>>>
>>> On Thu, Oct 13, 2016 at 2:12 PM, Gop Krr <gop@gmail.com> wrote:
>>>
>>>> Thanks James. I am looking to iterate through the table so that it
>>>> takes hash key values one by one. Can I achieve this through the Expression
>>>> Language? If I write a script to do that, how do I pass it to my
>>>> processor?
>>>> Thanks
>>>> Niraj
>>>>
>>>> On Thu, Oct 13, 2016 at 1:42 PM, James Wing <jvw...@gmail.com> wrote:
>>>>
>>>>> Rai,
>>>>>
>>>>> The GetDynamoDB processor requires a hash key value to look up an item
>>>>> in the table. The default setting is an Expression Language statement
>>>>> that reads the hash key value from a flowfile attribute,
>>>>> dynamodb.item.hash.key.value. But this is not required. You can change
>>>>> it to any attribute expression ${my.hash.key}, or even hard-code a
>>>>> single key "item123" if you wish.
>>>>>
>>>>> Does that help?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> James
>>>>>
>>>>> On Thu, Oct 13, 2016 at 12:17 PM, Gop Krr <gop@gmail.com> wrote:
>>>>>
>>>>>> Hi All,
>>>>>> I have been trying to use the get and load processors for DynamoDB and
>>>>>> I am almost there. I am able to run the get processor and I can see data
>>>>>> flowing :)
>>>>>> But I see the following error in my nifi-app.log file:
>>>>>>
>>>>>> 2016-10-13 18:02:38,823 ERROR [Timer-Driven Process Thread-9]
>>>>>> o.a.n.p.aws.dynamodb.GetDynamoDB
>>>>>> GetDynamoDB[id=7d906337-0157-1000-5868-479d0e0e3580]
>>>>>> Hash key value '' is required for flow file
>>>>>> StandardFlowFileRecord[uuid=44554c23-1618-47db-b46e-04ffd737748e,
>>>>>> claim=StandardContentClaim
>>>>>> [resourceClaim=StandardResourceClaim[id=1476381755460-37287,
>>>>>> container=default, section=423], offset=0, length=1048576],offset=0,
>>>>>> name=2503473718684086,size=1048576]
>>>>>>
>>>>>>
>>>>>> I understand that it's looking for the Hash Key Value, but I am not
>>>>>> sure how to pass it. In the settings tab, NiFi automatically populates
>>>>>> this: ${dynamodb.item.hash.key.value}, but it looks like this is not the
>>>>>> right way to do it. Can I get some guidance on this? Thanks for all the
>>>>>> help.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Rai
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>