[ https://issues.apache.org/jira/browse/HDFS-12213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156007#comment-16156007 ]
Anu Engineer commented on HDFS-12213:
-------------------------------------

tagging this as an "ozone merge" work item, since having an online tool makes the ozone system work against some real world data. This is not a must *do* work item before merge since the offline tool already provides enough workload. So let us shoot for the best effort basis for this.

> Ozone: Corona: Support for online mode
> --------------------------------------
>
> Key: HDFS-12213
> URL: https://issues.apache.org/jira/browse/HDFS-12213
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Nandakumar
> Assignee: Nandakumar
> Labels: ozoneMerge, tool
>
> This jira brings support for online mode in corona.
> In online mode, common crawl data from AWS will be used to populate ozone
> with data. Default source is [CC-MAIN-2017-17/warc.paths.gz |
> https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2017-17/warc.paths.gz]
> (it contains the path to actual data segment), user can override this using
> -source.
> The following values are derived from URL of Common Crawl data
> * Domain will be used as Volume
> * URL will be used as Bucket
> * FileName will be used as Key

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
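
For readers trying to picture the mapping quoted in the issue (Domain as Volume, URL as Bucket, FileName as Key), here is a minimal sketch in Python. It is not corona's code. The function name `derive_names` is hypothetical, and it assumes that "Domain" means the URL's host name, "URL" means the full URL string, and "FileName" means the last path segment. The issue does not say how corona interprets these terms or whether it normalizes the resulting names. The example URL is the default source link from the issue, reused only for illustration.

```python
from urllib.parse import urlparse
import posixpath


def derive_names(url):
    """Split a URL into (volume, bucket, key) candidates.

    Assumptions (not confirmed by the issue): the volume is the host name,
    the bucket is the full URL string, and the key is the last path segment.
    """
    parsed = urlparse(url)
    domain = parsed.hostname or ""                # "Domain"   -> Volume
    file_name = posixpath.basename(parsed.path)   # "FileName" -> Key
    return domain, url, file_name                 # "URL"      -> Bucket


if __name__ == "__main__":
    volume, bucket, key = derive_names(
        "https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2017-17/warc.paths.gz"
    )
    print(volume)  # commoncrawl.s3.amazonaws.com
    print(bucket)  # the full URL, unchanged
    print(key)     # warc.paths.gz
```

A real tool would likely need to sanitize these strings before using them as names. That step is left out here because the issue does not describe it.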