[GitHub] incubator-madlib pull request #138: Summary: Add param to determine num of c...

rashmi815 Wed, 07 Jun 2017 16:41:10 -0700

Github user rashmi815 commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/138#discussion_r120770959
  
    --- Diff: src/ports/postgres/modules/summary/summary.py_in ---
    @@ -7,84 +7,73 @@
     """
     import plpy
     from time import time
    -from utilities.utilities import __mad_version
    +
    +from utilities.control import MinWarning
     from Summarizer import Summarizer
    -version_wrapper = __mad_version()
    -_get_vector = version_wrapper.select_vecfunc()
    +
     
     def summary(schema_madlib, source_table, output_table, target_cols, grouping_cols,
    -    get_distinct, get_quartiles, ntile_array, how_many_mfv, get_estimates):
    +            get_distinct, get_quartiles, ntile_array, how_many_mfv,
    +            get_estimates, n_cols_per_run):
         """
    -        Main summary function that is called by SQL to execute summary
    +        Main summary function that is called by SQL to compute summary
             statistics on a table.
     
    -        @param schema_madlib        Madlib Schema namespace
    -        @param source_table         Name of input table
    -        @param output_table         Name of output table
    -        @param target_cols          Names of specific columns for which to get summary
    -        @param grouping_cols        Names of columns on which to group-by
    -                                        (no summary is provided for these columns)
    -        @param get_distinct         Should summary include distinct count
    -        @param get_quartiles        Should summary include quartile information
    -        @param ntile_array          Array for quantiles to include in summary
    -                                        (each element should be in [0, 1])
    -        @param how_many_mfv         How many frequent values to output?
    -        @param get_estimates        Should the summmary information be estimated or exact?
    +        @param schema_madlib   Madlib Schema namespace
    +        @param source_table    Name of input table
    +        @param output_table    Name of output table
    +        @param target_cols     Names of specific columns for which to get summary
    +        @param grouping_cols   Names of columns on which to group-by
    +                                   (no summary is provided for these columns)
    +        @param get_distinct    Should summary include distinct count
    +        @param get_quartiles   Should summary include quartile information
    +        @param ntile_array     Array for quantiles to include in summary
    +                                   (each element should be in [0, 1])
    +        @param how_many_mfv    How many frequent values to output?
    +        @param get_estimates   Should the summmary information be estimated or exact?
    --- End diff --
    
    Should there be an entry here for n_cols_per_run?
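
If an entry is added, it might read something like the sketch below. The wording assumes `n_cols_per_run` limits how many columns are summarized in a single run, which is only a guess from the parameter name. `batch_columns` is a hypothetical helper used to make the docstring concrete; it is not a function from this pull request.

```python
def batch_columns(target_cols, n_cols_per_run):
    """
        Hypothetical helper (not MADlib code): split column names into batches.

        @param target_cols     List of column names to summarize
        @param n_cols_per_run  Maximum number of columns handled in one run
                                   (assumed meaning, inferred from the name)
    """
    if n_cols_per_run is None or n_cols_per_run < 1:
        raise ValueError("n_cols_per_run must be a positive integer")
    # Slice the column list into consecutive chunks of at most n_cols_per_run.
    return [target_cols[i:i + n_cols_per_run]
            for i in range(0, len(target_cols), n_cols_per_run)]
```

The `@param` line follows the same alignment as the other entries in the docstring, so it could be appended after `get_estimates` with only the description adjusted to the real behavior.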



