Sliding Top N window
Good day,

I have the following task: a stream of "page views" comes into a Kafka topic, and each view contains a list of product IDs from the visited page. The goal is to have the Top N products in "real time". I am interested in a solution that requires a minimum of intermediate writes.

So I need to build a sliding window for the Top N products, where the product counters change dynamically and the window presents the top products for the specified period of time. I believe there is no way to avoid maintaining all product counters in memory/storage, but at least I would like to do all the logic and calculation on the fly, in memory, without spilling multiple RDDs from memory to disk.

I see one way of doing it:
- Take a message from Kafka and line up the elementary actions (increase the counter for product PID by 1).
- Implement each action as a call to HTable.increment() // or, more simply, incrementColumnValue().
- After each increment, apply my own "offer" operation so that only the top N products with their counters are kept in another HBase table (also with atomic operations).

But there is another stream of events: decreasing product counters when a view expires past the length of the sliding window.

So my question: does anybody know, or can share, a piece of code / know-how on how best to implement a "sliding Top N window"? If nothing is offered, I will share what I end up doing myself.

Thank you,
Alexey

This message, including any attachments, is the property of Sears Holdings Corporation and/or one of its subsidiaries. It is confidential and may contain proprietary or legally privileged information. If you are not the intended recipient, please delete it without reading the contents. Thank you.
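For what it's worth, here is a minimal in-memory sketch of the structure described above: increments on new views, decrements when views age out of the window, and a Top-N query over the live counters. The class and method names (SlidingTopN, offer, topN) are my own, not from any library; in the pipeline described, the HashMap updates would instead be the atomic HTable.increment(+1/-1) calls.

```java
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical sketch, not a production implementation: counts per product,
// plus a queue of timestamped views so each view can be "un-counted" once it
// falls out of the sliding window.
public class SlidingTopN {
    private static final class View {
        final long ts;
        final String productId;
        View(long ts, String productId) { this.ts = ts; this.productId = productId; }
    }

    private final long windowMillis;
    private final Map<String, Long> counts = new HashMap<>();
    private final Deque<View> views = new ArrayDeque<>(); // oldest first

    public SlidingTopN(long windowMillis) { this.windowMillis = windowMillis; }

    /** Record one page view of a product at time ts (the "+1" stream). */
    public void offer(String productId, long ts) {
        expire(ts);
        views.addLast(new View(ts, productId));
        counts.merge(productId, 1L, Long::sum);
    }

    /** Drop views older than the window (the "-1" stream of expirations). */
    private void expire(long now) {
        while (!views.isEmpty() && views.peekFirst().ts <= now - windowMillis) {
            View v = views.removeFirst();
            counts.computeIfPresent(v.productId, (k, c) -> c == 1 ? null : c - 1);
        }
    }

    /** Current Top-N product IDs by view count within the window. */
    public List<String> topN(int n, long now) {
        expire(now);
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        SlidingTopN top = new SlidingTopN(10_000); // 10-second window
        top.offer("A", 0);
        top.offer("B", 1_000);
        top.offer("A", 2_000);
        System.out.println(top.topN(2, 2_000)); // A (2 views) ahead of B (1 view)
    }
}
```

The memory cost is one queue entry per view inside the window plus one counter per distinct product, which matches the observation above that keeping all product counters somewhere seems unavoidable.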
Unsupported major.minor version 51.0
I found some discussions online, but they all come down to advice to use JDK 1.7 (or 1.8). Well, I use JDK 1.7 on OS X Yosemite. Both

java -version
// java version "1.7.0_80"
// Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
// Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

and

echo $JAVA_HOME
// /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home

show JDK 1.7. But for Spark 1.4.1 (and for Spark 1.2.2, downloaded 07/10/2015) I get the same error when building with Maven (as sudo mvn -DskipTests -X clean package abra.txt):

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/maven/cli/MavenCli : Unsupported major.minor version 51.0

Please help me figure out how to build the thing.

Thanks,
Alexey
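A note on what the error decodes to: class-file major version 51 corresponds to Java 7 (50 is Java 6, 49 is Java 5), so the Maven classes were compiled for Java 7 while the JVM actually running mvn is older. One likely culprit in the command above is sudo, which typically resets the environment, so sudo mvn may not see your JAVA_HOME and may fall back to an older system-default Java. A tiny check of which class-file version the current JVM supports (ClassVersionCheck is just my own throwaway name):

```java
// Prints the JVM version and the highest class-file version it can load.
// "51.0" means Java 7 bytecode is supported; anything lower would explain
// the UnsupportedClassVersionError from a Maven built for Java 7.
public class ClassVersionCheck {
    public static void main(String[] args) {
        System.out.println("java.version       = " + System.getProperty("java.version"));
        System.out.println("java.class.version = " + System.getProperty("java.class.version"));
    }
}
```

Running this under the same conditions as the failing build (i.e. under sudo) would show whether the mvn process is really getting JDK 1.7.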
Can't build Spark 1.3
I downloaded the latest Spark (1.3) from GitHub and tried to build it.

First for Scala 2.10 (and Hadoop 2.4):

build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

That resulted in a hang-up after printing a bunch of lines like:

[INFO] Dependency-reduced POM written at ……

Then I tried Scala 2.11:

mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package

That resulted in multiple compilation errors. What I actually want is:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-0.12.0 -Phive-thriftserver -DskipTests clean package

Is it only me who can't build Spark 1.3? And is there any site to download Spark prebuilt for Hadoop 2.5 and Hive?

Thank you for any help.
Alexey