[ https://issues.apache.org/jira/browse/PHOENIX-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Taylor updated PHOENIX-471:
---------------------------------
Assignee: (was: maghamravikiran)
> Sensor network end to end integration test case
> -----------------------------------------------
>
> Key: PHOENIX-471
> URL: https://issues.apache.org/jira/browse/PHOENIX-471
> Project: Phoenix
> Issue Type: Task
> Reporter: Andrew Purtell
>
> MERLSense Data
> -----
> Mitsubishi Electric Research Labs (MERL) collected motion sensor data from a
> network of over 200 sensors in the research facility for two years and then
> released a public data set ("MERLSense Data") in 2009. The data set contains
> over 50 million raw motion records and is distributed as a GZIP compressed
> tarball approximately 1.1 GB in size. See http://go.drwren.com/wmddata. The
> data set is described in a technical report available at
> http://www.merl.com/publications/docs/TR2007-069.pdf.
> The MERLSense data captures, at a granularity of seconds, the
> spatio-temporal structure of individuals walking down hallways, chatting
> with colleagues, and attending talks and meetings, on weekdays and weekends
> and through varying seasons and weather.
> This data has some nice properties for testing Phoenix's join capabilities
> and the current secondary index implementation. The key property is that
> row data is immutable, since each row is a record of a time series event.
> The raw motion trace files look like:
> 470 01179980510828 01179980511853 1.0
> 469 01179980512169 01179980513193 1.0
> 467 01179980513580 01179980514609 1.0
> 468 01179980514573 01179980515598 1.0
> The first element is the sensor identification number. The second and third
> numbers are the timestamps of the beginning and end of the event,
> respectively. Timestamps are the number of milliseconds since the epoch
> (January 1, 1970 UTC). Take care to use 64-bit integer representations when
> manipulating timestamps. The fourth number is the magnitude of the sensor
> reading, always 1.0.
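A minimal sketch of parsing one raw motion record as described above, assuming whitespace-separated fields. The names `MotionEvent`, `parse_trace_line`, and `to_utc` are illustrative and not part of the data set or of Phoenix.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class MotionEvent(NamedTuple):
    sensor_id: int
    start_ms: int      # milliseconds since the epoch (UTC)
    end_ms: int        # milliseconds since the epoch (UTC)
    magnitude: float


def parse_trace_line(line: str) -> MotionEvent:
    """Parse one whitespace-separated raw motion record."""
    sid, start, end, magnitude = line.split()
    # int() accepts the leading zero. Python integers are arbitrary
    # precision; in other languages use a 64-bit type for the timestamps.
    return MotionEvent(int(sid), int(start), int(end), float(magnitude))


def to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


event = parse_trace_line("470 01179980510828 01179980511853 1.0")
duration_ms = event.end_ms - event.start_ms
print(event.sensor_id, to_utc(event.start_ms).isoformat(), duration_ms)
```

The start timestamp in this record is larger than a signed 32-bit integer can hold, which is why the description asks for 64-bit representations.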
> The dataset includes a calibration file that associates the sensor IDs with
> a map of the lab. Each sensor ID corresponds to a unique sensor.
> sid,floor,wing
> 214,8,L
> 222,8,L
> 256,8,W
> 257,8,W
> The sensor IDs are associated with physical space by a table that contains
> one row per sensor, keyed by the sensor ID, with eight coordinates that
> specify the four corners of a quadrilateral in meters:
> sid,x1,y1,x2,y2,x3,y3,x4,y4
> 214,-13.3,23.1,-13.3,25.3,-15.5,25.3,-15.5,23.1
> 222,-13.3,20.9,-13.3,23.1,-15.5,23.1,-15.5,20.9
> 256,-15.5,8.3,-15.5,10.5,-17.7,10.5,-17.7,8.3
> 257,-13.3,8.3,-13.3,10.5,-15.5,10.5,-15.5,8.3
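A rough sketch of reading the coordinate rows above into corner points. The `share_an_edge` check is only an assumed heuristic for spotting neighbouring coverage areas (two quadrilaterals with at least two identical corners); the floor plan map shipped with the data remains the authoritative source for adjacency. All function names here are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# The four sample rows quoted above: sid,x1,y1,x2,y2,x3,y3,x4,y4
ROWS = """\
214,-13.3,23.1,-13.3,25.3,-15.5,25.3,-15.5,23.1
222,-13.3,20.9,-13.3,23.1,-15.5,23.1,-15.5,20.9
256,-15.5,8.3,-15.5,10.5,-17.7,10.5,-17.7,8.3
257,-13.3,8.3,-13.3,10.5,-15.5,10.5,-15.5,8.3
"""


def parse_location_row(row: str) -> Tuple[int, List[Point]]:
    """Split one CSV row into a sensor ID and its four (x, y) corners."""
    fields = row.split(",")
    coords = [float(v) for v in fields[1:]]
    corners = list(zip(coords[0::2], coords[1::2]))
    return int(fields[0]), corners


def share_an_edge(a: List[Point], b: List[Point]) -> bool:
    """Assumed heuristic: two areas touch if they have two common corners."""
    return len(set(a) & set(b)) >= 2


locations: Dict[int, List[Point]] = dict(
    parse_location_row(row) for row in ROWS.splitlines()
)
print(share_an_edge(locations[214], locations[222]))
print(share_an_edge(locations[214], locations[256]))
```

Exact float comparison is acceptable here only because both corners are parsed from identical decimal strings; computed coordinates would need a tolerance.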
> In addition, the data is given temporal meaning by several calendars that
> record the times and locations of various meetings and gatherings, the dates
> of official holidays, and a record of the number of people who were out of
> the office on given days.
> A daily almanac of the weather conditions as measured at nearby Boston Logan
> airport is also provided.
> Phoenix Integration Tests using MERLSense Data
> -----
> 0. Create the observation table.
> CREATE TABLE observations (
>     sensor_id INTEGER NOT NULL,
>     start_time BIGINT NOT NULL,
>     end_time BIGINT NOT NULL
>     CONSTRAINT pk PRIMARY KEY ( sensor_id, start_time )
> )
> IMMUTABLE_ROWS=true
> 1. Create indexes for motion events in each sensor by time in descending order
> CREATE INDEX observations_${sensor} ON observations ( start_time DESC )
> 2. a. Replay or bulk insert the motion sensor data.
> UPSERT INTO observations (sensor_id, start_time, end_time) VALUES (...)
> b. Generate realistic additional paths of "individuals" for upsert into
> the observations table at the desired rate. Choose between a short and long
> walk, with short being more likely. Then perform a random walk of the chosen
> number of transitions with variable delay at each step. A transition is valid
> only if a sensor's area is reachable from its predecessor's according to
> sensor coverage geography. There is a helpful map provided with the data
> showing sensor adjacencies on a floor plan.
> 3. Find TopN popular locations using the main observations table.
> 4. Select subsets of activity to study as joins over indexes.
> 5. Join motion sensor data with the sensor location table to produce result
> sets with spatial context. The sensor location table is very small so this
> should be possible to do in memory on the server.
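One possible sketch of the path generator from step 2b, under stated assumptions: the adjacency map below is a hypothetical stand-in for the real sensor adjacencies, and the step counts, probability of a short walk, event duration, and delay bound are illustrative parameters, not values from the issue. Each generated tuple matches the (sensor_id, start_time, end_time) columns of the observations table.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical adjacency map; real adjacencies come from the floor plan.
ADJACENT: Dict[int, List[int]] = {
    214: [222],
    222: [214],
    256: [257],
    257: [256],
}


def random_walk(
    adjacent: Dict[int, List[int]],
    start_ms: int,
    rng: random.Random,
    short_steps: int = 3,      # assumed length of a short walk
    long_steps: int = 10,      # assumed length of a long walk
    p_short: float = 0.8,      # short walks are more likely
    event_ms: int = 1000,      # assumed duration of one motion event
    max_delay_ms: int = 5000,  # upper bound on the variable delay
) -> List[Tuple[int, int, int]]:
    """Return (sensor_id, start_time, end_time) rows for one walk."""
    transitions = short_steps if rng.random() < p_short else long_steps
    sensor = rng.choice(sorted(adjacent))
    t = start_ms
    rows = [(sensor, t, t + event_ms)]
    for _ in range(transitions):
        # Only move to a sensor whose area is reachable from the current one.
        sensor = rng.choice(adjacent[sensor])
        t += event_ms + rng.randint(0, max_delay_ms)
        rows.append((sensor, t, t + event_ms))
    return rows


walk = random_walk(ADJACENT, start_ms=1179980510828, rng=random.Random(42))
for row in walk:
    print(row)
```

Start times strictly increase within a walk, so the rows of a single walk never collide on the (sensor_id, start_time) primary key. Concurrent walks could still collide, which a test harness would need to handle.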