[ 
https://issues.apache.org/jira/browse/PHOENIX-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-471:
---------------------------------
    Assignee:     (was: maghamravikiran)

> Sensor network end to end integration test case
> -----------------------------------------------
>
>                 Key: PHOENIX-471
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-471
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: Andrew Purtell
>
> MERLSense Data
> -----
> Mitsubishi Electric Research Labs (MERL) collected motion sensor data from a 
> network of over 200 sensors in the research facility for two years and then 
> released a public data set ("MERLSense Data") in 2009. The data set contains 
> over 50 million raw motion records and is distributed as a GZIP compressed 
> tarball approximately 1.1 GB in size. See http://go.drwren.com/wmddata . The 
> data set is described in a technical report available at 
> http://www.merl.com/publications/docs/TR2007-069.pdf.
> The MERLSense data contains spatio-temporal structure at the granularity of 
> seconds of individuals walking down hallways, chatting with colleagues, 
> attending talks and meetings, on weekdays and weekends through varying 
> seasons and weather.
> This data has some nice properties for testing Phoenix's join capabilities 
> and also the current secondary index implementation, the key one being that 
> row data is immutable, since each row is a record of time series data.
> The raw motion trace files look like:
>     470 01179980510828 01179980511853 1.0
>     469 01179980512169 01179980513193 1.0
>     467 01179980513580 01179980514609 1.0
>     468 01179980514573 01179980515598 1.0
> The first element is the sensor identification number. The second and third 
> numbers are the timestamps of the beginning and end of the event, 
> respectively. Timestamps are the number of milliseconds since the epoch 
> (January 1, 1970 UTC). Take care to use 64-bit integer representations when 
> manipulating timestamps. The fourth number is the magnitude of the sensor 
> reading, always 1.0. 
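
A minimal sketch of parsing one raw trace line, assuming whitespace-separated fields in the order described above. The names MotionEvent, parse_trace_line, and start_utc are illustrative and not part of the data set or of Phoenix. Python integers are arbitrary precision, so the 64-bit caveat mainly matters when the values are handed to typed storage such as a BIGINT column; the range check makes that explicit.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

INT64_MAX = 2**63 - 1  # largest value a signed 64-bit column can hold


@dataclass(frozen=True)
class MotionEvent:
    sensor_id: int
    start_ms: int    # milliseconds since the Unix epoch (UTC)
    end_ms: int      # milliseconds since the Unix epoch (UTC)
    magnitude: float


def parse_trace_line(line: str) -> MotionEvent:
    """Parse one 'sid start end magnitude' record from a raw trace file."""
    sid, start, end, magnitude = line.split()
    # int() reads the zero-padded timestamps as base-10 values.
    event = MotionEvent(int(sid), int(start), int(end), float(magnitude))
    if not (0 <= event.start_ms <= event.end_ms <= INT64_MAX):
        raise ValueError(f"timestamps out of range or out of order: {line!r}")
    return event


def start_utc(event: MotionEvent) -> datetime:
    """Convert the start timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(event.start_ms / 1000, tz=timezone.utc)


event = parse_trace_line("470 01179980510828 01179980511853 1.0")
duration_ms = event.end_ms - event.start_ms
```

The resulting (sensor_id, start_ms, end_ms) values line up with the columns of the observations table defined further down.
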
> The dataset includes a calibration file that associates the sensor IDs with 
> a map of the lab. Each sensor ID corresponds to a unique sensor.
>     sid,floor,wing
>     214,8,L
>     222,8,L
>     256,8,W
>     257,8,W
> The sensor IDs are associated with physical space by a table that contains 
> one row per sensor, keyed by the sensor ID, with eight coordinates that 
> specify the four corners of a quadrilateral in meters:
>     sid,x1,y1,x2,y2,x3,y3,x4,y4
>     214,-13.3,23.1,-13.3,25.3,-15.5,25.3,-15.5,23.1
>     222,-13.3,20.9,-13.3,23.1,-15.5,23.1,-15.5,20.9
>     256,-15.5,8.3,-15.5,10.5,-17.7,10.5,-17.7,8.3
>     257,-13.3,8.3,-13.3,10.5,-15.5,10.5,-15.5,8.3
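
To make the coordinate layout concrete, the sketch below reads rows in the format shown above and computes each quadrilateral's area with the shoelace formula, plus the mean of its corners as a rough center point. It assumes the four corners are listed in order around the perimeter, which holds for the sample rows but is not stated in the description. The names load_quads, shoelace_area, and center are illustrative only.

```python
import csv
import io

SAMPLE = """sid,x1,y1,x2,y2,x3,y3,x4,y4
214,-13.3,23.1,-13.3,25.3,-15.5,25.3,-15.5,23.1
222,-13.3,20.9,-13.3,23.1,-15.5,23.1,-15.5,20.9
256,-15.5,8.3,-15.5,10.5,-17.7,10.5,-17.7,8.3
257,-13.3,8.3,-13.3,10.5,-15.5,10.5,-15.5,8.3
"""


def load_quads(text):
    """Return {sensor_id: [(x1, y1), ..., (x4, y4)]} from CSV text."""
    quads = {}
    for row in csv.DictReader(io.StringIO(text)):
        quads[int(row["sid"])] = [
            (float(row[f"x{i}"]), float(row[f"y{i}"])) for i in range(1, 5)
        ]
    return quads


def shoelace_area(corners):
    """Area in square meters of a simple polygon given in perimeter order."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(corners, corners[1:] + corners[:1]):
        total += xa * yb - xb * ya
    return abs(total) / 2.0


def center(corners):
    """Mean of the corner coordinates, used here as a rough center point."""
    xs, ys = zip(*corners)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


quads = load_quads(SAMPLE)
areas = {sid: shoelace_area(c) for sid, c in quads.items()}
```

For the four sample sensors each quadrilateral is a 2.2 m by 2.2 m square, so every area comes out at roughly 4.84 square meters.
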
> In addition, the data is given temporal meaning by several calendars that 
> record the times and locations of various meetings and gatherings, the dates 
> of official holidays, and a record of the number of people who were out of 
> the office on given days.
> A daily almanac of the weather conditions as measured at nearby Boston Logan 
> airport is also provided.
> Phoenix Integration Tests using MERLSense Data
> -----
> 0. Create the observation table.
>         CREATE TABLE observations (
>             sensor_id INTEGER NOT NULL,
>             start_time BIGINT NOT NULL,
>             end_time BIGINT NOT NULL
>             CONSTRAINT pk PRIMARY KEY ( sensor_id, start_time )
>         )
>         IMMUTABLE_ROWS=true
> 1. Create indexes for motion events in each sensor by time in descending order
>         CREATE INDEX observations_${sensor} ON observations ( start_time DESC )
> 2. a. Replay or bulk insert the motion sensor data.
>         UPSERT INTO observations (sensor_id, start_time, end_time) VALUES (...)
>    b. Generate realistic additional paths of "individuals" for upsert into 
> the observations table at the desired rate. Choose between a short and long 
> walk, with short being more likely. Then perform a random walk of the chosen 
> number of transitions with variable delay at each step. A transition is valid 
> only if a sensor's area is reachable from its predecessor's according to 
> sensor coverage geography. There is a helpful map provided with the data 
> showing sensor adjacencies on a floor plan.
> 3. Find the top N most popular locations using the main observations table. 
> 4. Select subsets of activity to study as joins over indexes.
> 5. Join motion sensor data with the sensor location table to produce result 
> sets with spatial context. The sensor location table is very small so this 
> should be possible to do in memory on the server.
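
The synthetic path generation in step 2b could be sketched roughly as follows. The adjacency map, walk lengths, short-walk probability, dwell time, and delay range are all placeholder assumptions; real adjacencies would come from the floor plan shipped with the data, and none of these names or values appear in the data set itself.

```python
import random

# Hypothetical adjacency map: sensor_id -> sensors reachable in one step.
# The real adjacencies come from the floor plan shipped with the data set.
ADJACENT = {
    214: [222],
    222: [214, 257],
    257: [222, 256],
    256: [257],
}

SHORT_WALK, LONG_WALK = 3, 12            # transitions per walk (assumed)
P_SHORT = 0.8                            # short walks are more likely (assumed)
DWELL_MS = 1025                          # assumed length of one motion event
MIN_DELAY_MS, MAX_DELAY_MS = 500, 5000   # variable delay between steps (assumed)


def generate_walk(rng, start_sensor, start_ms):
    """Return (sensor_id, start_time, end_time) rows for one synthetic walk."""
    steps = SHORT_WALK if rng.random() < P_SHORT else LONG_WALK
    sensor, now = start_sensor, start_ms
    rows = [(sensor, now, now + DWELL_MS)]
    for _ in range(steps):
        # Only move to a sensor whose area is reachable from the current one.
        sensor = rng.choice(ADJACENT[sensor])
        now += DWELL_MS + rng.randint(MIN_DELAY_MS, MAX_DELAY_MS)
        rows.append((sensor, now, now + DWELL_MS))
    return rows


rng = random.Random(471)                 # fixed seed so the run is repeatable
walk = generate_walk(rng, start_sensor=214, start_ms=1179980510828)
```

Each returned row has the same shape as the column list of the UPSERT in step 2a, so the rows can be bound to that statement as parameters at whatever rate the test calls for.
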



