[
https://issues.apache.org/jira/browse/BIGTOP-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972230#comment-13972230
]
jay vyas commented on BIGTOP-1275:
----------------------------------
You can see the architecture at this link (I pasted it into erdos):
[https://chart.googleapis.com/chart?chl=digraph+ethane+{%0D%0A%0D%0A+++node+[shape%3Drecord]%3B%0D%0A%0D%0A+++PIG_ANALYTICS+[label%3D%22PIG_ANALYTICS|Unstructured-unsupported-pigscripts|+pig_ad_hoc%280-n%29%22]%3B%0D%0A%0D%0A+++CUSTOMER_PAGE+[label%3D%22CUSTOMER_PAGE|json|CUSTOMER_PAGE%2Fpart*%22]%3B%0D%0A+++DIRTY_CSV+[label%3D%22DIRTY_CSV|fname+++lname+-prod+%2C+price+%2Cprod%2C..|generated%2Fpart*%22]%3B%0D%0A+++CSV+[label%3D%22CSV|fname%2Clname%2Cprod%2Cprice%2Cdate%2Cxcoord%2Cycoord%2C...|cleaned%2Fpart*%22]%3B%0D%0A+++MAHOUT_VIEW_INPUT+[label%3D%22MAHOUT_VIEW++|++%28hashed+name%29+10001%2C+%28hashed+purchases%29+203+|++%3Chive_warehouse%3E%2Fmahout_cf_in%2Fpart*%22+]%3B%0D%0A+++MAHOUT_CF+[label%3D%22MAHOUT_CF++|+%28hashed+name%29+10001%2C+%28hashed+product%29+201%2C+.6+|+mahout_cf_out%2Fpart*%22+]%3B%0D%0A+%0D%0A+++Generate+-%3E+DIRTY_CSV+[label%3D%22hadoop+jar+bigpetstore.jar+org.bigtop.bigpetstore.generator.BPSGenerator+100+bps%2Fgenerated%2F%22]+%3B%0D%0A+++DIRTY_CSV+-%3E+pig+[label%3D%22%22]%3B++%0D%0A+++%0D%0A+++pig+-%3E+CSV+[label%3D%22hadoop+jar+bigpetstore.jar+org.bigtop.bigpetstore.etl.PigCSVCleaner+bps%2Fgenerated%2F+bps%2Fcleaned%2F%22]%3B%0D%0A+++pig+-%3E+PIG_ANALYTICS+[label%3D%22same+as+CSV+job%2C+but+add+your+scripts+to+end...+p1.pig+p2.pig+...%22]%3B%0D%0A+++PIG_ANALYTICS+-%3E+CSV%3B%0D%0A+++PROD_HASH+-%3E+hive+[label%3D%22hive+hash+udf%22]%3B%0D%0A+++USER_HASH+-%3E+hive++[label%3D%22hive+hash+udf%22]%3B%0D%0A+++%0D%0A+++CSV+-%3E+hive+%3B+%0D%0A+++hive+-%3E+MAHOUT_VIEW_INPUT+[label%3D%22hadoop+jar+bigpetstore.jar+org.bigtop.bigpetstore.etl.HiveViewCreator+bps%2Fpig_out+mahout_cf_in%22]%3B++++++++++%0D%0A+++MAHOUT_VIEW_INPUT+-%3E+mahout_collab_filter_recomender++-%3E+MAHOUT_CF%3B%0D%0A+++MAHOUT_CF++-%3E+crunch+%3B%0D%0A+++CSV+-%3E+crunch+%3B+%0D%0A+++crunch+-%3E+CUSTOMER_PAGE+[label%3D%22high+performance+joining%22]%3B%0D%0A%0D%0A}&cht=gv]
> BigPetStore: Add all 50 states
> ------------------------------
>
> Key: BIGTOP-1275
> URL: https://issues.apache.org/jira/browse/BIGTOP-1275
> Project: Bigtop
> Issue Type: Improvement
> Components: Blueprints
> Affects Versions: backlog
> Reporter: jay vyas
>
> Jeff Dutton at OrangeFS has created a pull request to generate synthetic data
> for all 50 states in the union.
> https://github.com/acberk/bigpetstore/blob/903f62fe04b02962a4f97c6f4c9717cd78af1989/src/main/java/org/bigtop/bigpetstore/generator/TransactionIteratorFactory.java
> Shall we roll this in as a patch? Maybe we can do so in a slightly more
> sophisticated way (i.e. have the states in a configuration file - this
> 200-line enum is getting pretty large).
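
To make the configuration-file idea concrete, here is a minimal sketch, not the actual BigPetStore code. It assumes each state carries a single numeric weight and that the states live in a properties-style file with one STATE=weight entry per line. The class name StateConfig, the file layout, and the sample values are all hypothetical; the fields of the real enum in TransactionIteratorFactory may differ.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper: loads state weights from a properties-style source
// instead of hard-coding them in an enum.
public class StateConfig {

    // Parses "STATE=weight" entries (assumed layout) into a sorted map.
    public static Map<String, Float> load(Reader in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        Map<String, Float> states = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            states.put(name, Float.parseFloat(props.getProperty(name).trim()));
        }
        return states;
    }

    public static void main(String[] args) throws IOException {
        // Sample values are made up for illustration only.
        String sample = "AZ=0.1\nCA=0.5\nOK=0.4\n";
        Map<String, Float> states = load(new StringReader(sample));
        System.out.println(states);
    }
}
```

With something like this, adding the remaining states would mean editing a data file rather than growing the enum, though the generator would need to look states up by name instead of by enum constant.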
--
This message was sent by Atlassian JIRA
(v6.2#6252)