Hi Chenliang,

Would you like to discuss your contribution during the next MADlib virtual community meeting? We can set it at an Asia-friendly time. Does February 17th at 9:00 AM CST (China Standard Time) work?
We usually begin the community meeting with a 20-minute presentation followed by a Q&A session. You will have 20 minutes to talk about your contribution.

Thanks,
Karen

---------- Forwarded message ----------
From: chenliang wang <[email protected]>
Date: Thu, Jan 14, 2016 at 8:18 AM
Subject: Re: How to contribute a spatial module to MADlib manipulating objects from PostGIS
To: [email protected]

Cool! I'd like to join the next discussion.

Best,
Chenliang Wang

On 01/13/2016 06:36 PM, Greg Chase wrote:
> As I said, our next call is not China-friendly:
>
> http://mail-archives.apache.org/mod_mbox/incubator-madlib-dev/201601.mbox/%3CCAMg1VtnKB-WoyVqCstfMNCcJVOn2HKQQ6wNfqdovhgnB7zd5cw%40mail.gmail.com%3E
>
> This is this Friday, 10AM Pacific Standard Time, which is 2AM Saturday
> Beijing time.
>
> We will arrange a next call in a couple of weeks at an Asia-friendly time to
> support contributors in Asia.
>
> However, if you make the next call, we will make time for you to talk :)
>
> Regards,
>
> -Greg
>
> On Wed, Jan 13, 2016 at 2:18 AM, Kuien Liu <[email protected]> wrote:
>
>> Great, I would like to join it, please send me an invitation if possible.
>>
>> Cheers,
>> Kuien Liu
>>
>> On Wed, Jan 13, 2016 at 6:10 PM, Greg Chase <[email protected]> wrote:
>>
>>> Perhaps ChenLiang would like to join a call with the MADlib community and
>>> discuss his contribution?
>>>
>>> We have a call this Friday 10AM PST which is not a friendly time for
>>> China, but we can schedule a next call at a friendlier time.
>>>
>>> This email encrypted by tiny buttons & fat thumbs, beta voice
>>> recognition, and autocorrect on my iPhone.
>>>
>>> On Jan 13, 2016, at 1:53 AM, Ivan Novick <[email protected]> wrote:
>>>
>>>> Cool!
>>>>
>>>> On Wed, Jan 13, 2016 at 5:52 PM, Kuien Liu <[email protected]> wrote:
>>>>
>>>>> Got it, I think I can have a (f2f) talk with Chenliang Wang, as he
>>>>> graduated from an institute of CAS which is not far from our Beijing
>>>>> office, and I am familiar with his supervisor and lab director. So I think
>>>>> it is highly possible to find him directly in Beijing.
>>>>>
>>>>> Cheers,
>>>>> Kuien Liu
>>>>>
>>>>> On Wed, Jan 13, 2016 at 3:05 PM, Ivan Novick <[email protected]> wrote:
>>>>>
>>>>>> Hello ChenLiang,
>>>>>>
>>>>>> I have read your description of the interface and to my understanding
>>>>>> this is a supervised machine learning algorithm that supports geometry
>>>>>> data. Am I correct?
>>>>>>
>>>>>> What could be a good industrial use case for this model for some
>>>>>> examples? Could you train a system based on locations and weather to find
>>>>>> bad signals for cell phones? Can you provide any real world example
>>>>>> scenario where this type of model will be useful for end users?
>>>>>>
>>>>>> Also I am adding CC to some of my colleagues at work. Kuien, Max,
>>>>>> Yandong can you provide any feedback on this proposal from your Point of
>>>>>> View?
>>>>>>
>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-madlib-dev/201601.mbox/%[email protected]%3E
>>>>>>
>>>>>> Cheers,
>>>>>> Ivan
>>>>>>
>>>>>> On Wed, Jan 13, 2016 at 11:20 AM, WangChenLiang <[email protected]> wrote:
>>>>>>
>>>>>>> Sorry, the link of attachment (http://1drv.ms/1ZjAiCg) is lost in the
>>>>>>> previous letter.
>>>>>>>
>>>>>>> From: [email protected]
>>>>>>>> To: [email protected]
>>>>>>>> Subject: RE: How to contribute a spatial module to MADlib manipulating objects from PostGIS
>>>>>>>> Date: Wed, 13 Jan 2016 11:09:17 +0800
>>>>>>>>
>>>>>>>> Hi Caleb and Ivan!
>>>>>>>> Thanks for your attention and help.
>>>>>>>> I reviewed the previous draft and found
>>>>>>>> something inappropriate. The archive containing the new draft and example code
>>>>>>>> is attached to this letter and should be more reasonable than the earlier edition.
>>>>>>>> Please go over the manuscript and give suggestions again.
>>>>>>>> The following are my answers to Caleb's questions.
>>>>>>>>
>>>>>>>> - Does this function require PostGIS to also be installed? If yes, it would be better
>>>>>>>> if we disable the function if PostGIS is not present rather than introduce PostGIS
>>>>>>>> as a dependency. (Similar to what we do with our requirement on the xml module with
>>>>>>>> our PMML export functionality).
>>>>>>>>
>>>>>>>> A: Yes. I am trying to avoid inputting any spatial datatypes in the interface of GWR.
>>>>>>>> But I have no idea if it is necessary to provide a simple alternative when PostGIS is
>>>>>>>> not available.
>>>>>>>>
>>>>>>>> - What are the exact datatypes in the function definition for regression_location
>>>>>>>> and prediction_location?
>>>>>>>>
>>>>>>>> A: I changed the datatype to TEXT, holding the name of a POINT or MULTIPOLYGON field
>>>>>>>> (the centroid of each polygon is used for GWR estimation).
>>>>>>>>
>>>>>>>> - In the description it describes regression_location as "The length of
>>>>>>>> regression_location must be equal to the length of source_table", which signals to me
>>>>>>>> that it is likely intended to be a column of the source table? If not then how is
>>>>>>>> this length represented?
>>>>>>>>
>>>>>>>> A: In the previous interface, I was trying to input a geometry field which could be
>>>>>>>> from another table with a different number of rows.
>>>>>>>> Now, I have changed the argument definition to TEXT.
>>>>>>>> It must be the name of the geometry field in the source table.
>>>>>>>>
>>>>>>>> - You didn't mark regression_location as (optional). Due to the way Postgres
>>>>>>>> functions work all optional arguments must come after all required arguments,
>>>>>>>> so having a non-optional argument in the middle of the optional list must be
>>>>>>>> avoided.
>>>>>>>>
>>>>>>>> A: Thanks for reminding me of this mistake. It is really my fault. The order of
>>>>>>>> arguments is changed in this edition.
>>>>>>>>
>>>>>>>> - I haven't read through the literature, but it is not immediately clear to me why
>>>>>>>> prediction_location is a parameter to gwregr_train() rather than gwregr_predict().
>>>>>>>> Can you provide a brief description to the way that prediction_location is used in
>>>>>>>> the model and its relationship to training and prediction.
>>>>>>>>
>>>>>>>> A: Actually, there are three kinds of location data in GWR modeling: the locations
>>>>>>>> of the sample data, of regression, and of prediction.
>>>>>>>>
>>>>>>>> Locations of sample data indicate where the sample data are. Locations of regression
>>>>>>>> indicate where regression should be conducted. If they are identical to the data
>>>>>>>> locations (in most instances), diagnostic information can be calculated.
>>>>>>>>
>>>>>>>> Locations of prediction indicate where coefficients should be predicted. It should be a
>>>>>>>> parameter for a predict function. Putting regression_location into the training
>>>>>>>> function is just for omitting kernel arguments and maybe not
>>>>>>>> appropriate.
>>>>>>>> In the process of training, GWR estimates weights and coefficients from the distance
>>>>>>>> between data_locations and regression_locations. Then, diagnostic information is
>>>>>>>> estimated when these two locations are identical. We can treat
>>>>>>>> data_location as regression_location to simplify the process, not taking
>>>>>>>> locations different from the data location in the training step.
>>>>>>>>
>>>>>>>> In the process of prediction, there are two new pieces of information: new
>>>>>>>> independent variables and new locations. Therefore, the coefficients and weight
>>>>>>>> vector must be estimated again. GWR can estimate coefficients at any position
>>>>>>>> using the independent variables of the sample data. If we also provide independent
>>>>>>>> variables at any position, we can also obtain the dependent variable at any position.
>>>>>>>> So if we treat coefficients at prediction_location as a training result and put
>>>>>>>> the coefficients into prediction directly, it is reasonable to put it into the training
>>>>>>>> function. But if we treat it as a part of prediction, it is appropriate to set
>>>>>>>> prediction_location within the predict function. Then the prediction function
>>>>>>>> must require kernel parameters in addition to new data and locations for prediction.
>>>>>>>> Maybe this way is clearer and more reasonable, and is similar to other GWR packages in R.
>>>>>>>>
>>>>>>>> I rewrote the description of the interface taking your suggestions into account. I
>>>>>>>> moved prediction_location into the predict function and modified
>>>>>>>> some mistakes and unnecessary arguments.
>>>>>>>> The new draft of the interface design is attached to this
>>>>>>>> letter.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> ChenLiang Wang
>>>>>>>>
>>>>>>>> From: [email protected]
>>>>>>>>> Date: Tue, 5 Jan 2016 10:31:20 -0800
>>>>>>>>> Subject: Re: How to contribute a spatial module to MADlib manipulating objects from PostGIS
>>>>>>>>> To: [email protected]
>>>>>>>>>
>>>>>>>>> Hi ChenLiang,
>>>>>>>>>
>>>>>>>>> Thanks for taking the next step to flesh this out.
>>>>>>>>>
>>>>>>>>> As a whole:
>>>>>>>>> - naming and basic interface seems consistent with existing conventions.
>>>>>>>>> - names are descriptive.
>>>>>>>>> - references to the literature are provided.
>>>>>>>>> - functionality is complementary to the library.
>>>>>>>>>
>>>>>>>>> What is not clear to me is:
>>>>>>>>> - Does this function require PostGIS to also be installed? If yes, it
>>>>>>>>> would be better if we disable the function if PostGIS is not present rather
>>>>>>>>> than introduce PostGIS as a dependency. (Similar to what we do with our
>>>>>>>>> requirement on the xml module with our PMML export functionality).
>>>>>>>>> - What are the exact datatypes in the function definition for
>>>>>>>>> regression_location and prediction_location?
>>>>>>>>> - In the description it describes regression_location as "The length of
>>>>>>>>> regression_location must be equal to the length of source_table", which
>>>>>>>>> signals to me that it is likely intended to be a column of the source
>>>>>>>>> table? If not then how is this length represented?
>>>>>>>>> - You didn't mark regression_location as (optional).
>>>>>>>>> Due to the way
>>>>>>>>> Postgres functions work all optional arguments must come after all required
>>>>>>>>> arguments, so having a non-optional argument in the middle of the optional
>>>>>>>>> list must be avoided.
>>>>>>>>> - I haven't read through the literature, but it is not immediately clear to
>>>>>>>>> me why prediction_location is a parameter to gwregr_train() rather than
>>>>>>>>> gwregr_predict(). Can you provide a brief description to the way that
>>>>>>>>> prediction_location is used in the model and its relationship to training
>>>>>>>>> and prediction.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Caleb
>>>>>>>>
>>>>>>>> ChenLiang wants to share a file with you on OneDrive. To view the file, click the link below.
>>>>>>>> gwr4madlib.rar
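
Note on the three kinds of locations discussed in the thread (sample data, regression, prediction): the following is a minimal sketch, not code from the thread or from MADlib. It assumes a Gaussian kernel with a fixed bandwidth and a single independent variable; the thread specifies neither. The names gaussian_weight, local_coefficients, and gwr_predict are hypothetical and are not the proposed gwregr_train() / gwregr_predict() interface. The sketch only illustrates why predicting at a new location needs the kernel parameters again: the coefficients are re-estimated at each prediction location from the sample data.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel (an assumption): weight 1 at distance 0, decaying with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_coefficients(data_locations, x, y, target, bandwidth):
    """Weighted least squares for y = b0 + b1 * x, centred on one target location.

    Each sample is weighted by its distance from `target`, so every target
    location gets its own (b0, b1).
    """
    w = [gaussian_weight(math.dist(loc, target), bandwidth) for loc in data_locations]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def gwr_predict(data_locations, x, y, new_locations, new_x, bandwidth):
    """Re-estimate coefficients at each new location, then apply them to new_x."""
    predictions = []
    for loc, xi in zip(new_locations, new_x):
        b0, b1 = local_coefficients(data_locations, x, y, loc, bandwidth)
        predictions.append(b0 + b1 * xi)
    return predictions


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [3.0, 5.0, 7.0, 9.0]
    # "Training" case: regression location equals a data location.
    print(local_coefficients(locations, xs, ys, locations[0], bandwidth=1.0))
    # "Prediction" case: a new location and a new independent variable.
    print(gwr_predict(locations, xs, ys, [(2.0, 2.0)], [10.0], bandwidth=1.0))
```

Calling local_coefficients with a data location as the target corresponds to the training case in which regression locations equal data locations. gwr_predict corresponds to the case where prediction_location belongs to the predict function, which is why it takes the bandwidth as an argument.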
