[ 
https://issues.apache.org/jira/browse/BIGTOP-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323607#comment-14323607
 ] 

RJ Nowling commented on BIGTOP-1653:
------------------------------------

[~jayunit100] neat work!  It's great to have an example of how to run SQL 
queries in Spark.

The following needs to be fixed before we can commit the patch:

1. Change package in PetStoreStatistics.scala from generator to analytics

2. Remove {noformat}:Any = {noformat} from the main method of 
PetStoreStatistics so that it returns Unit and can be executed directly.  
Related compiler warning:

{noformat}
/home/rnowling/Projects/bigtop_1653/bigtop-bigpetstore/bigpetstore-spark/src/main/scala/org/apache/bigpetstore/spark/analytics/PetStoreStatistics.scala:40:
 PetStoreStatistics has a main method with parameter type Array[String], but 
org.apache.bigtop.bigpetstore.spark.generator.PetStoreStatistics will not be a 
runnable program.
  Reason: main method must have exact signature (Array[String])Unit
object PetStoreStatistics {
{noformat}

3. Change app name from "BPS Data Generator" to something like 
"PetStoreStatistics" in main method of PetStoreStatistics

4. Parameters should be checked before creating a SparkContext, otherwise the 
output is hard to read.

5. Usage isn't printed when the wrong number of parameters is given.

6. There are stray semicolons in the source code.  Scala doesn't need 
semicolons, so I suggest grepping the files for all instances and deleting them.

7. The comment on lines 65-69 in PetStoreStatistics seems to be left over from 
the Generator code.

8. Move the import of the SQL context on line 74 of PetStoreStatistics to the 
top of the file.

9. What are lines 86-91 of PetStoreStatistics for?  I see comments saying that 
the lines shouldn't be used except for testing, plus a commented-out line.

10. Instead of just writing out total transactions, why not write out the 
number of transactions by month?

11. Can you add a meaningful variable name for the result of SQL query 2 like 
you did with SQL query 1 instead of just calling collect() in the return 
statement?

12. Please add a section to the README with instructions for running 
PetStoreStatistics from the CLI.
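
To illustrate items 2 through 5, here is a rough sketch of the shape I have in 
mind for the entry point.  It is not taken from the patch: the argument names, 
the argument count, and the parseArgs/printUsage helpers are placeholders I 
made up, so adapt them to whatever PetStoreStatistics actually needs.  I wrote 
the return type as an explicit Unit, which has the same effect as dropping 
":Any =".

{noformat}
import org.apache.spark.{SparkConf, SparkContext}

object PetStoreStatistics {

  // Hypothetical usage string; the real parameters may differ.
  private def printUsage(): Unit =
    System.err.println("Usage: PetStoreStatistics <input_dir> <output_file>")

  // Hypothetical helper: accepts exactly two arguments, rejects anything else.
  def parseArgs(args: Array[String]): Option[(String, String)] =
    args match {
      case Array(input, output) => Some((input, output))
      case _                    => None
    }

  // Exact signature (Array[String])Unit, so the object is runnable.
  def main(args: Array[String]): Unit = {
    parseArgs(args) match {
      case None =>
        // Fail fast, before any Spark logging starts.
        printUsage()
        sys.exit(1)
      case Some((input, output)) =>
        val conf = new SparkConf().setAppName("PetStoreStatistics")
        val sc = new SparkContext(conf)
        try {
          // Run the SQL queries against input and write the JSON to output.
        } finally {
          sc.stop()
        }
    }
  }
}
{noformat}

Because the arguments are validated before the SparkContext is constructed, a 
bad invocation prints only the usage line and exits.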



> Add queries for customer, state, and product statistics w/ d3 friendly JSON 
> output to analytics phase. 
> -------------------------------------------------------------------------------------------------------
>
>                 Key: BIGTOP-1653
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1653
>             Project: Bigtop
>          Issue Type: Improvement
>          Components: blueprints
>    Affects Versions: 0.9.0
>            Reporter: jay vyas
>            Assignee: jay vyas
>             Fix For: 0.9.0
>
>         Attachments: BIGTOP-1653.patch, BIGTOP-1653.patch, BIGTOP-1653.patch
>
>
> Follow-on to BIGTOP-1536.  This time we can use a Scala JSON library if a 
> good one exists. 



