[
https://issues.apache.org/jira/browse/BIGTOP-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
jay vyas updated BIGTOP-1536:
-----------------------------
Attachment: firstpass.patch
Just a quick patch that adds a single query (total transaction count) and
updates the tests to run the generator, the ETL, and the new
BigPetStoreStatistics class. Obviously I'll polish it some more shortly.
For now, I don't know how to deserialize the saved files from the ETL job,
which are plain text, in the format:
{noformat}
ArrayBuffer(6,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@456f43c1,Texas
City,TX,0,Junpei Snashall,07201,Elizabeth,NJ,2,Fri May 08 02:15:38 PDT
2015,category=dry dog food;brand=Dog
Days;flavor=Chicken;size=30.0;per_unit_cost=3.0;,
6,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@456f43c1,Texas
City,TX,0,Junpei Snashall,07201,Elizabeth,NJ,2,Fri May 08 02:15:38 PDT
2015,category=poop bags;brand=Dog
Days;color=Blue;size=120.0;per_unit_cost=0.21;)
ArrayBuffer(7,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@48f5c8f2,Greenville,NC,0,Junpei
Snashall,07201,Elizabeth,NJ,1,Mon Mar 30 06:38:32 PDT 2015,category=poop
bags;brand=Happy Pup;color=multicolor;size=60.0;per_unit_cost=0.17;)
ArrayBuffer(3,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@3febff9e,Crawford,CO,0,Junpei
Snashall,07201,Elizabeth,NJ,0,Tue Feb 03 14:57:38 PST 2015,category=poop
bags;brand=Happy Pup;color=multicolor;size=60.0;per_unit_cost=0.17;,
3,com.github.rnowling.bps.datagenerator.datamodels.inputs.ZipcodeRecord@3febff9e,Crawford,CO,0,Junpei
Snashall,07201,Elizabeth,NJ,0,Tue Feb 03 14:57:38 PST 2015,category=dry dog
food;brand=Dog Days;flavor=Fish & Potato;size=30.0;per_unit_cost=3.0;)
{noformat}
Any thoughts on how to read those in?
{noformat}
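import org.apache.spark.SparkContext

// Counts the transactions in the ETL output; totalTransactions is defined
// elsewhere in the patch.
def run(transactionsInputDir: String, sc: SparkContext): Boolean = {
  println("input : " + transactionsInputDir)
  // Read the ETL output as plain text, asking for at least 10 partitions.
  val t = totalTransactions(sc.textFile(transactionsInputDir, 10), sc)
  println("Transaction count = " + t)
  sc.stop()
  true
}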
{noformat}
Right now, that's all I'm doing (objectFile fails since the output is not an
actual SequenceFile).
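
One possible way to read those lines back, as a rough sketch only: treat each
text line as the toString of an ArrayBuffer, strip the wrapper, split the
records on ", " and the fields on ",". The TransactionLine case class, the
TransactionParser object, and all field names below are hypothetical and
guessed from the sample output above; the sketch also assumes that no field
contains a comma and that each record has exactly 12 fields. The
ZipcodeRecord@... field looks like a default toString, so it is skipped rather
than parsed.

```scala
// Hypothetical record type; field names are guesses from the sample output.
case class TransactionLine(
  storeId: Int,
  storeCity: String,
  storeState: String,
  customerId: Int,
  customerName: String,
  transactionId: Int,
  date: String,                  // kept as raw text, not parsed
  product: Map[String, String])

object TransactionParser {
  private val Prefix = "ArrayBuffer("

  // Parse "k1=v1;k2=v2;" into a Map of product attributes.
  def parseProduct(s: String): Map[String, String] =
    s.split(";").filter(_.contains("=")).map { kv =>
      val Array(k, v) = kv.split("=", 2)
      k -> v
    }.toMap

  // Parse one text line of the form ArrayBuffer(rec, rec, ...).
  def parseLine(line: String): Seq[TransactionLine] = {
    val trimmed = line.trim
    require(trimmed.startsWith(Prefix) && trimmed.endsWith(")"),
      s"unexpected line: $line")
    val body = trimmed.substring(Prefix.length, trimmed.length - 1)
    if (body.isEmpty) Seq.empty
    else body.split(", ").toSeq.map { rec =>
      // Assumed layout: 12 comma-separated fields, product string last.
      val f = rec.split(",", 12)
      require(f.length == 12, s"expected 12 fields, got ${f.length}: $rec")
      // f(1) is the ZipcodeRecord@... reference; f(6) to f(8) appear to be
      // the customer's zip, city, and state and are ignored here.
      TransactionLine(f(0).toInt, f(2), f(3), f(4).toInt, f(5),
        f(9).toInt, f(10), parseProduct(f(11)))
    }
  }
}
```

With Spark this could be applied as
sc.textFile(transactionsInputDir).flatMap(TransactionParser.parseLine).
Alternatively, if the ETL job is changed to write its RDD with
saveAsObjectFile instead of as text, sc.objectFile should be able to read it
back, since that method does produce a SequenceFile of serialized objects.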
> Add Basic Sales Analytics Example to BPS Spark
> ----------------------------------------------
>
> Key: BIGTOP-1536
> URL: https://issues.apache.org/jira/browse/BIGTOP-1536
> Project: Bigtop
> Issue Type: Improvement
> Components: blueprints
> Reporter: RJ Nowling
> Assignee: RJ Nowling
> Attachments: firstpass.patch
>
>
> Using the Spark data generator and ETL script (BIGTOP-1535), add a simple
> Spark sales analytics example that computes basic stats such as:
> * Number of sales per category per month or quarter
> * Top selling items in each category per month or quarter