[ https://issues.apache.org/jira/browse/BIGTOP-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jay vyas updated BIGTOP-1536:
-----------------------------
    Attachment: firstpass.patch

Just a quick patch that adds a single query (the total transaction count) and
updates the tests to run the generator, the ETL job, and the new
BigPetStoreStatistics class.

Obviously I'll polish it some more shortly.

For now, I don't know how to deserialize the files saved by the ETL job, which
are plain text in the following format:

{noformat}

ArrayBuffer(6,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@456f43c1,Texas City,TX,0,Junpei Snashall,07201,Elizabeth,NJ,2,Fri May 08 02:15:38 PDT 2015,category=dry dog food;brand=Dog Days;flavor=Chicken;size=30.0;per_unit_cost=3.0;, 6,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@456f43c1,Texas City,TX,0,Junpei Snashall,07201,Elizabeth,NJ,2,Fri May 08 02:15:38 PDT 2015,category=poop bags;brand=Dog Days;color=Blue;size=120.0;per_unit_cost=0.21;)
ArrayBuffer(7,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@48f5c8f2,Greenville,NC,0,Junpei Snashall,07201,Elizabeth,NJ,1,Mon Mar 30 06:38:32 PDT 2015,category=poop bags;brand=Happy Pup;color=multicolor;size=60.0;per_unit_cost=0.17;)
ArrayBuffer(3,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@3febff9e,Crawford,CO,0,Junpei Snashall,07201,Elizabeth,NJ,0,Tue Feb 03 14:57:38 PST 2015,category=poop bags;brand=Happy Pup;color=multicolor;size=60.0;per_unit_cost=0.17;, 3,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@3febff9e,Crawford,CO,0,Junpei Snashall,07201,Elizabeth,NJ,0,Tue Feb 03 14:57:38 PST 2015,category=dry dog food;brand=Dog Days;flavor=Fish & Potato;size=30.0;per_unit_cost=3.0;)
{noformat}

Any thoughts on how to read those in?

{noformat}
    def run(transactionsInputDir: String, sc: SparkContext): Boolean = {
      println("input : " + transactionsInputDir)
      // Read the ETL output as plain text (at least 10 partitions) and count it.
      val t = totalTransactions(sc.textFile(transactionsInputDir, 10), sc)
      println("Transaction count = " + t)
      sc.stop()
      true
    }
{noformat}

Right now, that's all I'm doing (objectFile fails since the input is not an
actual SequenceFile).
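
One possible way to read the plain-text lines back in, sketched here without
Spark so it can be tried in isolation: strip the ArrayBuffer(...) wrapper, split
the records on the ";, " separator, and split each record into 12
comma-separated fields. The names LineItem, TransactionLineParser, and every
field name are hypothetical; the field meanings are guessed from the sample
lines above and are not taken from the patch. It also assumes each
ArrayBuffer(...) entry sits on a single line in the actual files.

```scala
// Hypothetical record type; field meanings are guesses based on the sample output.
case class LineItem(storeId: Int, customerName: String, transactionId: Int,
                    date: String, product: Map[String, String])

object TransactionLineParser {
  private val Prefix = "ArrayBuffer("

  // Parses one "ArrayBuffer(record, record, ...)" line into its records.
  def parseLine(line: String): Seq[LineItem] = {
    val trimmed = line.trim
    require(trimmed.startsWith(Prefix) && trimmed.endsWith(")"),
      s"unexpected line: $line")
    val body = trimmed.substring(Prefix.length, trimmed.length - 1)
    // Every record ends with ';' and ArrayBuffer.toString joins elements with ", ".
    body.split(";, ").toSeq.map(parseRecord)
  }

  private def parseRecord(record: String): LineItem = {
    // 11 leading fields plus the product string, which is kept whole as field 12.
    val f = record.split(",", 12)
    require(f.length == 12, s"expected 12 fields, got ${f.length}: $record")
    // The product string is a list of key=value pairs separated by ';'.
    val product = f(11).split(";").filter(_.contains("=")).map { kv =>
      val Array(k, v) = kv.split("=", 2)
      k -> v
    }.toMap
    LineItem(f(0).trim.toInt, f(5), f(9).trim.toInt, f(10), product)
  }
}
```

With Spark, something like sc.textFile(transactionsInputDir).flatMap(TransactionLineParser.parseLine)
should then yield an RDD of records. An alternative, if the ETL job can be
changed, would be to write its output with saveAsObjectFile so that objectFile
can read it directly.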

> Add Basic Sales Analytics Example to BPS Spark
> ----------------------------------------------
>
>                 Key: BIGTOP-1536
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1536
>             Project: Bigtop
>          Issue Type: Improvement
>          Components: blueprints
>            Reporter: RJ Nowling
>            Assignee: RJ Nowling
>         Attachments: firstpass.patch
>
>
> Using the Spark data generator and ETL script (BIGTOP-1535), add a simple 
> Spark sales analytics example that computes basic stats such as:
> * Number of sales per category per month or quarter
> * Top selling items in each category per month or quarter
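
As a rough illustration of the first statistic in the description above, here
is a minimal sketch that counts sales per category per month using plain Scala
collections. SalesPerCategory, monthKey, and countByCategoryAndMonth are
hypothetical names, and the date layout ("Fri May 08 02:15:38 PDT 2015") is
assumed from the sample ETL output in this thread.

```scala
object SalesPerCategory {
  // Builds a "year-month" key such as "2015-May" from a date string like
  // "Fri May 08 02:15:38 PDT 2015" (tokens: weekday, month, day, time, zone, year).
  def monthKey(date: String): String = {
    val tokens = date.trim.split("\\s+")
    require(tokens.length == 6, s"unexpected date: $date")
    s"${tokens(5)}-${tokens(1)}"
  }

  // Input: (category, date) pairs, one per item sold.
  // Output: number of items sold for each (category, year-month) key.
  def countByCategoryAndMonth(sales: Seq[(String, String)]): Map[(String, String), Int] =
    sales
      .groupBy { case (category, date) => (category, monthKey(date)) }
      .map { case (key, group) => key -> group.size }
}
```

On an RDD the same idea would presumably be a map to ((category, month), 1)
pairs followed by reduceByKey(_ + _).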



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
