Perhaps ChenLiang would like to join a call with the MADlib community and 
discuss his contribution?

We have a call this Friday at 10 AM PST, which is not a friendly time for
China, but we can schedule the next call at a friendlier time.

This email encrypted by tiny buttons & fat thumbs, beta voice recognition, and 
autocorrect on my iPhone.

> On Jan 13, 2016, at 1:53 AM, Ivan Novick <[email protected]> wrote:
> 
> Cool!
> 
>> On Wed, Jan 13, 2016 at 5:52 PM, Kuien Liu <[email protected]> wrote:
>> 
>> Got it. I think I can have a face-to-face talk with Chenliang Wang, as he
>> graduated from a CAS institute that is not far from our Beijing office,
>> and I am familiar with his supervisor and lab director. So I think it is
>> highly likely that I can find him directly in Beijing.
>> 
>> Cheers,
>> Kuien Liu
>> 
>>> On Wed, Jan 13, 2016 at 3:05 PM, Ivan Novick <[email protected]> wrote:
>>> 
>>> Hello ChenLiang,
>>> 
>>> I have read your description of the interface and to my understanding
>>> this is a supervised machine learning algorithm that supports geometry
>>> data.  Am I correct?
>>> 
>>> What would be a good industrial use case for this model?  For example,
>>> could you train a system based on locations and weather to find bad
>>> signals for cell phones?  Can you provide any real-world example
>>> scenario where this type of model would be useful for end users?
>>> 
>>> Also, I am CCing some of my colleagues at work.  Kuien, Max, Yandong,
>>> can you provide any feedback on this proposal from your point of
>>> view?
>>> 
>>> 
>>> http://mail-archives.apache.org/mod_mbox/incubator-madlib-dev/201601.mbox/%[email protected]%3E
>>> 
>>> Cheers,
>>> Ivan
>>> 
>>> 
>>> On Wed, Jan 13, 2016 at 11:20 AM, WangChenLiang <[email protected]>
>>> wrote:
>>> 
>>>> Sorry, the link to the attachment (http://1drv.ms/1ZjAiCg) was missing
>>>> from the previous letter.
>>>> 
>>>>> From: [email protected]
>>>>> To: [email protected]
>>>>> Subject: RE: How to contribute a spatial module to MADlib manipulating objects from PostGIS
>>>>> Date: Wed, 13 Jan 2016 11:09:17 +0800
>>>>> 
>>>>> 
>>>>> 
>>>>> Hi Caleb and Ivan!
>>>>>
>>>>> Thanks for your attention and help. I reviewed the previous draft and
>>>>> found some things that were inappropriate. An archive containing the
>>>>> new draft and example code is attached to this letter; it should be
>>>>> more reasonable than the earlier edition. Please go over the
>>>>> manuscript and give your suggestions again.
>>>>>
>>>>> The following are my answers to Caleb's questions.
>>>>> - Does this function require PostGIS to also be installed? If yes, it
>>>>> would be better if we disable the function if PostGIS is not present
>>>>> rather than introduce PostGIS as a dependency. (Similar to what we do
>>>>> with our requirement on the xml module with our PMML export
>>>>> functionality).
>>>>> 
>>>>> 
>>>>> 
>>>>> A: Yes. I am trying to avoid passing any spatial datatypes through the
>>>>> GWR interface, but I am not sure whether it is necessary to provide a
>>>>> simple alternative when PostGIS is not available.
>>>>> 
>>>>> 
>>>>> 
>>>>> - What are the exact datatypes in the function definition for
>>>>> regression_location and prediction_location?
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> A: I changed the datatype to TEXT, holding the name of a POINT or
>>>>> MULTIPOLYGON field (for a MULTIPOLYGON, the centroid of each polygon
>>>>> is used for GWR estimation).
>>>>> 
>>>>> 
>>>>> 
>>>>> - In the description it describes regression_location as "The length
>>>>> of regression_location must be equal to the length of source_table",
>>>>> which signals to me that it is likely intended to be a column of the
>>>>> source table? If not then how is this length represented?
>>>>> 
>>>>> 
>>>>> A: In the previous interface, I was trying to accept a geometry field
>>>>> that could come from another table with a different number of rows.
>>>>> I have now changed the argument definition to TEXT; it must be the
>>>>> name of a geometry field in the source table.
>>>>> 
>>>>> 
>>>>> 
>>>>> - You didn't mark regression_location as (optional). Due to the way
>>>>> Postgres functions work all optional arguments must come after all
>>>>> required arguments, so having a non-optional argument in the middle
>>>>> of the optional list must be avoided.
>>>>> 
>>>>> 
>>>>> 
>>>>> A: Thanks for reminding me of this mistake. It is really my fault.
>>>>> The order of the arguments has been changed in this edition.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> - I haven't read through the literature, but it is not immediately
>>>>> clear to me why prediction_location is a parameter to gwregr_train()
>>>>> rather than gwregr_predict(). Can you provide a brief description to
>>>>> the way that prediction_location is used in the model and its
>>>>> relationship to training and prediction.
>>>>> 
>>>>> 
>>>>> 
>>>>> A: Actually, there are three kinds of location data in GWR modeling:
>>>>> the locations of the sample data, of regression, and of prediction.
>>>>> 
>>>>> Locations of sample data indicate where the sample data are.
>>>>> Locations of regression indicate where the regression should be
>>>>> conducted. If they are identical to the data locations (as in most
>>>>> instances), diagnostic information can be calculated.
>>>>> 
>>>>> Locations of prediction indicate where coefficients should be
>>>>> predicted, so they should be a parameter of a predict function.
>>>>> Putting regression_location into the training function is just a way
>>>>> to omit the kernel arguments and may not be appropriate. During
>>>>> training, GWR estimates the weights and coefficients from the
>>>>> distances between data_locations and regression_locations.
>>>>> Diagnostic information is then estimated when these two sets of
>>>>> locations are identical. We can treat data_location as
>>>>> regression_location to simplify the process, so that the training
>>>>> step does not take locations different from the data locations.
>>>>> 
>>>>> In the process of prediction, there are two new pieces of
>>>>> information: new independent variables and new locations. Therefore,
>>>>> the coefficients and the weight vector must be estimated again. GWR
>>>>> can estimate coefficients at any position using the independent
>>>>> variables of the sample data. If we also provide independent
>>>>> variables at any position, we can also obtain the dependent variable
>>>>> at any position. So if we treat the coefficients at
>>>>> prediction_location as a training result and feed them into
>>>>> prediction directly, it is reasonable to put it into the training
>>>>> function. But if we treat it as part of prediction, it is
>>>>> appropriate to set prediction_location within the predict function.
>>>>> In that case, the prediction function must require the kernel
>>>>> parameters in addition to the new data and locations for prediction.
>>>>> Maybe this way is clearer and more reasonable, and it is similar to
>>>>> other GWR packages in R.
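
For readers following the thread, here is a minimal Python sketch of the
prediction step described in the quoted answer above: weights are computed
from the distance between the sample locations and a new location, local
coefficients are re-estimated by weighted least squares, and the new
independent variable is plugged in. This is not MADlib's code and not the
proposed module; the function names, the Gaussian kernel, the fixed
bandwidth, and the single-predictor model are all assumptions made for
illustration.

```python
import math

# Hypothetical helper: Gaussian kernel weight of every sample location
# relative to one target location (kernel choice is an assumption).
def gaussian_weights(data_locs, target, bandwidth):
    return [
        math.exp(-0.5 * (math.dist(p, target) / bandwidth) ** 2)
        for p in data_locs
    ]

# Hypothetical helper: weighted least squares for y = b0 + b1 * x,
# solved from the 2x2 weighted normal equations.
def local_coefficients(x, y, weights):
    sw = sum(weights)
    swx = sum(w * xi for w, xi in zip(weights, x))
    swy = sum(w * yi for w, yi in zip(weights, y))
    swxx = sum(w * xi * xi for w, xi in zip(weights, x))
    swxy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    det = sw * swxx - swx * swx  # zero only if all x values are equal
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1

# Hypothetical predict step: it needs the kernel parameter (bandwidth)
# as well as the new location and the new independent variable.
def gwr_predict(data_locs, x, y, new_loc, new_x, bandwidth):
    weights = gaussian_weights(data_locs, new_loc, bandwidth)
    b0, b1 = local_coefficients(x, y, weights)
    return b0 + b1 * new_x

# Made-up sample data: four points on a unit square.
sample_locs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
sample_x = [1.0, 2.0, 3.0, 4.0]
sample_y = [5.0, 8.0, 11.0, 14.0]
print(gwr_predict(sample_locs, sample_x, sample_y, (0.5, 0.5), 2.5, 1.0))
```

Because the weights depend on the new location, the predict step in this
sketch takes the bandwidth as an argument, which mirrors the point above
that a predict function would need the kernel parameters.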
>>>>> 
>>>>> 
>>>>> 
>>>>> I rewrote the description of the interface taking your suggestions
>>>>> into account. I moved prediction_location into the predict function,
>>>>> fixed some mistakes, and removed unnecessary arguments. The new draft
>>>>> of the interface design is attached to this letter.
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> ChenLiang Wang
>>>>> 
>>>>> 
>>>>> 
>>>>>> From: [email protected]
>>>>>> Date: Tue, 5 Jan 2016 10:31:20 -0800
>>>>>> Subject: Re: How to contribute a spatial module to MADlib manipulating objects from PostGIS
>>>>>> To: [email protected]
>>>>>> 
>>>>>> Hi ChenLiang,
>>>>>> 
>>>>>> Thanks for taking the next step to flesh this out.
>>>>>> 
>>>>>> As a whole:
>>>>>> - naming and basic interface seem consistent with existing conventions.
>>>>>> - names are descriptive.
>>>>>> - references to the literature are provided.
>>>>>> - functionality is complementary to the library.
>>>>>> 
>>>>>> What is not clear to me is:
>>>>>> - Does this function require PostGIS to also be installed?  If yes, it
>>>>>> would be better if we disable the function if PostGIS is not present
>>>>>> rather than introduce PostGIS as a dependency.  (Similar to what we do
>>>>>> with our requirement on the xml module with our PMML export
>>>>>> functionality).
>>>>>> - What are the exact datatypes in the function definition for
>>>>>> regression_location and prediction_location?
>>>>>> - In the description it describes regression_location as "The length
>>>>>> of regression_location must be equal to the length of source_table",
>>>>>> which signals to me that it is likely intended to be a column of the
>>>>>> source table?  If not then how is this length represented?
>>>>>> - You didn't mark regression_location as (optional).  Due to the way
>>>>>> Postgres functions work all optional arguments must come after all
>>>>>> required arguments, so having a non-optional argument in the middle of
>>>>>> the optional list must be avoided.
>>>>>> - I haven't read through the literature, but it is not immediately
>>>>>> clear to me why prediction_location is a parameter to gwregr_train()
>>>>>> rather than gwregr_predict().  Can you provide a brief description to
>>>>>> the way that prediction_location is used in the model and its
>>>>>> relationship to training and prediction.
>>>>>> 
>>>>>> Regards,
>>>>>>  Caleb
>>>>> ChenLiang wants to share a file with you on OneDrive. To view the
>>>>> file, click the link below.
>>>>> gwr4madlib.rar
>> 
