[ 
https://issues.apache.org/jira/browse/MADLIB-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank McQuillan updated MADLIB-1002:
------------------------------------
    Description: 
Story

As a data scientist, I want to perform session reconstruction on my data set, 
so that I can prepare for input into other algorithms like path functions, or 
predictive analytics algorithms.

This is a follow-on to 
https://issues.apache.org/jira/browse/MADLIB-909
https://issues.apache.org/jira/browse/MADLIB-1001
to add minimum time.

Details 

Add min_time to the existing params:

{code}
'output_all_cols = <value>, -- existing param
 create_view = <value>,  -- existing param
 min_time = <value>'
{code} 
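
The params string above is a comma-delimited list of key-value pairs. As a rough illustration only, the Python sketch below shows one way such a string could be split into named parameters. The helper name parse_params is hypothetical and is not part of MADlib. The sketch assumes that values contain no commas and that entries not written as <param_name> = <value> are ignored, as the earlier description of the params argument states.

```python
from typing import Dict


def parse_params(params: str) -> Dict[str, str]:
    """Split a 'key = value, key = value' string into a dict.

    Hypothetical helper for illustration; not a MADlib function.
    Entries that do not follow <param_name> = <value> are skipped.
    """
    parsed: Dict[str, str] = {}
    for item in params.split(","):
        # partition() returns an empty separator when no '=' is present.
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            continue  # malformed entry: ignore it
        parsed[key] = value
    return parsed


print(parse_params("output_all_cols = true, create_view = false, min_time = 0.5"))
```

Values are kept as strings here; converting min_time to a float would be a separate validation step.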

Parameters

min_time (optional)
FLOAT, default: 0.  Minimum delta time that must elapse between events for 
the session to be considered a real (human) session.  Specify it in the same 
units as the time_out parameter.  Usage: if an event occurs less than 
min_time after the last valid event in the session, do not include that event 
in the current session.  When evaluating the next event, compare it against 
the last valid session event, not against the event that was just dropped.
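
The min_time rule can be sketched in Python as below. This is only an illustration of the behavior described above, not the MADlib implementation. The function name assign_sessions is hypothetical. The sketch also makes several assumptions: timestamps are plain numbers in a single partition, a new session starts when the gap since the last valid event is strictly greater than time_out, and dropped events are reported with a session id of None.

```python
from typing import List, Optional, Tuple


def assign_sessions(
    timestamps: List[float], time_out: float, min_time: float = 0.0
) -> List[Tuple[float, Optional[int]]]:
    """Return (timestamp, session_id) pairs; dropped events get None.

    Hypothetical sketch: boundary conditions (< min_time, > time_out)
    are assumptions, not taken from the MADlib source.
    """
    result: List[Tuple[float, Optional[int]]] = []
    session_id = 0
    last_valid: Optional[float] = None  # time of the last event kept in a session

    for t in sorted(timestamps):
        if last_valid is None:
            session_id = 1  # the first event opens the first session
        else:
            delta = t - last_valid
            if delta < min_time:
                # Too soon after the last valid event: drop it and do NOT
                # update last_valid, so the next event is compared against
                # the last valid event rather than this dropped one.
                result.append((t, None))
                continue
            if delta > time_out:
                session_id += 1  # gap exceeds the time out: new session
        result.append((t, session_id))
        last_valid = t
    return result


for row in assign_sessions([0, 0.5, 1.2, 10, 10.1, 30], time_out=5, min_time=1):
    print(row)
```

In this example the event at 0.5 is dropped, and the event at 1.2 is kept because it is measured against the valid event at 0, not against the dropped event at 0.5. With the default min_time of 0, no events are dropped.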

For an example of how min_time could work, see Aster Analytics sessionization 
function [4].

References

[4] Aster Analytics user's guide, see the "sessionize" function
http://www.info.teradata.com/edownload.cfm?itemid=143450001
http://www.info.teradata.com/templates/eSrchResults.cfm?txtpid=&txtrelno=&prodline=all&frmdt=&txtsrchstring=aster%20analytics&srtord=Desc&todt=&rdSort=Date
https://www.youtube.com/watch?v=C760M9ttK9Q

  was:
Story

As a data scientist, I want to perform session reconstruction on my data set, 
so that I can prepare for input into other algorithms like path functions, or 
predictive analytics algorithms.

This is a follow on to 
https://issues.apache.org/jira/browse/MADLIB-909
https://issues.apache.org/jira/browse/MADLIB-1001
to add minimum time.

Details 

Add min time to the existing params:

params (optional)
TEXT, default: NULL. Parameters for sessionization in a comma-delimited string 
of key-value pairs. See the description below for details.

Parameters

Parameters in this section are supplied in the params argument as a string 
containing a comma-delimited list of key-value pairs. All of these named 
parameters are optional, and their order does not matter. You must use the 
format <param_name> = <value> to specify the value of a parameter, otherwise 
the parameter is ignored.

{code}
'output_all_cols = <value>, -- existing param
 create_view = <value>,  -- existing param
 min_time = <value>'
{code} 

Parameters

min_time (optional)
FLOAT  Minimum delta time that must elapse between events in order for this 
session to be considered a real (human) session (default=0).  User should 
specify in the same units as the time_out parameter.   Usage:  if an event 
happens less than min_time, do not include the event in the current session.  
When you look at the next event, compare against the last valid session event, 
not against the one you just dropped.

For an example of how min_time could work, see Aster Analytics sessionization 
function [1].

References

[4] Aster Analytics users guide, see "sessionize" function
http://www.info.teradata.com/edownload.cfm?itemid=143450001
http://www.info.teradata.com/templates/eSrchResults.cfm?txtpid=&txtrelno=&prodline=all&frmdt=&txtsrchstring=aster%20analytics&srtord=Desc&todt=&rdSort=Date
https://www.youtube.com/watch?v=C760M9ttK9Q


> Sessionization - Phase 3 (minimum time)
> ---------------------------------------
>
>                 Key: MADLIB-1002
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1002
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Utilities
>            Reporter: Frank McQuillan
>            Priority: Minor
>              Labels: gsoc2016, starter
>             Fix For: v1.9.2
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
