Cool!

On Wed, Jan 13, 2016 at 5:52 PM, Kuien Liu <[email protected]> wrote:
> Got it. I think I can have a face-to-face talk with Chenliang Wang, as he
> graduated from a CAS institute that is not far from our Beijing office,
> and I am familiar with his supervisor and lab director. So I think it is
> highly likely that I can find him directly in Beijing.
>
> Cheers,
> Kuien Liu
>
> On Wed, Jan 13, 2016 at 3:05 PM, Ivan Novick <[email protected]> wrote:
>
>> Hello ChenLiang,
>>
>> I have read your description of the interface, and to my understanding
>> this is a supervised machine learning algorithm that supports geometry
>> data. Am I correct?
>>
>> What would be a good industrial use case for this model? For example,
>> could you train a system based on locations and weather to find bad
>> cell phone signals? Can you provide any real-world example scenario
>> where this type of model would be useful for end users?
>>
>> Also, I am CCing some of my colleagues at work. Kuien, Max, Yandong,
>> can you provide any feedback on this proposal from your point of view?
>>
>> http://mail-archives.apache.org/mod_mbox/incubator-madlib-dev/201601.mbox/%[email protected]%3E
>>
>> Cheers,
>> Ivan
>>
>>
>> On Wed, Jan 13, 2016 at 11:20 AM, WangChenLiang <[email protected]>
>> wrote:
>>
>>> Sorry, the link to the attachment (http://1drv.ms/1ZjAiCg) was lost in
>>> the previous letter.
>>>
>>> > From: [email protected]
>>> > To: [email protected]
>>> > Subject: RE: How to contribute a spatial module to MADlib manipulating objects from PostGIS
>>> > Date: Wed, 13 Jan 2016 11:09:17 +0800
>>> >
>>> > Hi Caleb and Ivan!
>>> > Thanks for your attention and help. I reviewed the previous draft and
>>> > found some things that were inappropriate. The archive attached to this
>>> > letter contains the new draft and example code, which should be more
>>> > reasonable than the earlier edition. Please go over the manuscript and
>>> > give your suggestions again.
>>> > The following are my answers to Caleb's questions.
>>> >
>>> > - Does this function require PostGIS to also be installed? If yes, it
>>> > would be better if we disable the function if PostGIS is not present
>>> > rather than introduce PostGIS as a dependency. (Similar to what we do
>>> > with our requirement on the xml module with our PMML export
>>> > functionality).
>>> >
>>> > A: Yes. I am trying to avoid taking any spatial datatypes as input in
>>> > the interface of GWR. But I am not sure whether it is necessary to
>>> > provide a simple alternative when PostGIS is not available.
>>> >
>>> > - What are the exact datatypes in the function definition for
>>> > regression_location and prediction_location?
>>> >
>>> > A: I changed the datatype to TEXT, holding the name of a POINT or
>>> > MULTIPOLYGON field (the centroid of each polygon is used for GWR
>>> > estimation).
>>> >
>>> > - In the description it describes regression_location as "The length of
>>> > regression_location must be equal to the length of source_table", which
>>> > signals to me that it is likely intended to be a column of the source
>>> > table? If not then how is this length represented?
>>> >
>>> > A: In the previous interface, I was trying to take as input a geometry
>>> > field that could come from another table with a different number of
>>> > rows. Now I have changed the argument definition to TEXT. It must be the
>>> > name of a geometry field in the source table.
>>> >
>>> > - You didn't mark regression_location as (optional). Due to the way
>>> > Postgres functions work all optional arguments must come after all
>>> > required arguments, so having a non-optional argument in the middle of
>>> > the optional list must be avoided.
>>> >
>>> > A: Thanks for reminding me of this mistake. It is really my fault. The
>>> > order of the arguments is changed in this edition.
>>> >
>>> > - I haven't read through the literature, but it is not immediately clear
>>> > to me why prediction_location is a parameter to gwregr_train() rather
>>> > than gwregr_predict(). Can you provide a brief description to the way
>>> > that prediction_location is used in the model and its relationship to
>>> > training and prediction.
>>> >
>>> > A: Actually, there are three kinds of location data in GWR modeling:
>>> > the locations of the sample data, of regression, and of prediction.
>>> >
>>> > Locations of sample data indicate where the sample data are. Locations
>>> > of regression indicate where regression should be conducted. If they
>>> > are identical to the data locations (as in most instances), diagnostic
>>> > information can be calculated.
>>> >
>>> > Locations of prediction indicate where coefficients should be
>>> > predicted. They should be a parameter of a predict function. Putting
>>> > regression_location into the training function is just for omitting
>>> > kernel arguments and may not be appropriate. In the process of
>>> > training, GWR estimates weights and coefficients using the distance
>>> > between data_locations and regression_locations. Then, diagnostic
>>> > information is estimated when these two sets of locations are
>>> > identical. We can treat data_location as regression_location to
>>> > simplify the process, not taking locations different from the data
>>> > locations in the training step.
>>> >
>>> > In the process of prediction, there are two new pieces of information:
>>> > new independent variables and new locations. Therefore, the
>>> > coefficients and weight vector must be estimated again. GWR can
>>> > estimate coefficients at any position using the independent variables
>>> > of the sample data. If we also provide independent variables at any
>>> > position, we can also obtain the dependent variable at any position.
>>> > So if we treat the coefficients at prediction_location as a training
>>> > result to put into prediction directly, it is reasonable to put it
>>> > into the training function. But if we treat it as a part of prediction,
>>> > it is appropriate to set prediction_location within the predict
>>> > function. In that case, the prediction function must require kernel
>>> > parameters in addition to new data and locations for prediction. Maybe
>>> > this way is clearer and more reasonable, and it is similar to other GWR
>>> > packages in R.
>>> >
>>> > I rewrote the description of the interface taking your suggestions into
>>> > account. I moved prediction_location into the predict function, fixed
>>> > some mistakes, and removed unnecessary arguments. The new draft of the
>>> > interface design is attached to this letter.
>>> >
>>> > Regards,
>>> >
>>> > ChenLiang Wang
>>> >
>>> > > From: [email protected]
>>> > > Date: Tue, 5 Jan 2016 10:31:20 -0800
>>> > > Subject: Re: How to contribute a spatial module to MADlib manipulating objects from PostGIS
>>> > > To: [email protected]
>>> > >
>>> > > Hi ChenLiang,
>>> > >
>>> > > Thanks for taking the next step to flesh this out.
>>> > >
>>> > > As a whole:
>>> > > - naming and basic interface seem consistent with existing conventions.
>>> > > - names are descriptive.
>>> > > - references to the literature are provided.
>>> > > - functionality is complementary to the library.
>>> > >
>>> > > What is not clear to me is:
>>> > > - Does this function require PostGIS to also be installed? If yes, it
>>> > > would be better if we disable the function if PostGIS is not present
>>> > > rather than introduce PostGIS as a dependency. (Similar to what we do
>>> > > with our requirement on the xml module with our PMML export
>>> > > functionality).
>>> > > - What are the exact datatypes in the function definition for
>>> > > regression_location and prediction_location?
>>> > > - In the description it describes regression_location as "The length
>>> > > of regression_location must be equal to the length of source_table",
>>> > > which signals to me that it is likely intended to be a column of the
>>> > > source table? If not then how is this length represented?
>>> > > - You didn't mark regression_location as (optional). Due to the way
>>> > > Postgres functions work all optional arguments must come after all
>>> > > required arguments, so having a non-optional argument in the middle of
>>> > > the optional list must be avoided.
>>> > > - I haven't read through the literature, but it is not immediately
>>> > > clear to me why prediction_location is a parameter to gwregr_train()
>>> > > rather than gwregr_predict(). Can you provide a brief description to
>>> > > the way that prediction_location is used in the model and its
>>> > > relationship to training and prediction.
>>> > >
>>> > > Regards,
>>> > > Caleb
>>> >
>>> > ChenLiang wants to share a file with you on OneDrive. To view the file,
>>> > click the link below.
>>> > gwr4madlib.rar
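
For readers of the archive, the distinction discussed in this thread (weights come from distances between sample locations and a target location, and coefficients are re-estimated at each target) can be illustrated with a small sketch. This is not the proposed gwregr_train()/gwregr_predict() interface and not MADlib code. It assumes a Gaussian kernel with a fixed bandwidth, a single independent variable plus an intercept, and planar coordinates; the function names gaussian_weights, local_coefficients, and predict_at are hypothetical.

```python
import math


def gaussian_weights(data_locations, target, bandwidth):
    """Gaussian kernel weight of each sample relative to one target location.

    Assumption: fixed bandwidth and Euclidean distance on planar coordinates.
    """
    weights = []
    for x, y in data_locations:
        d = math.hypot(x - target[0], y - target[1])
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    return weights


def local_coefficients(data_locations, xs, ys, target, bandwidth):
    """Weighted least squares fit of y = b0 + b1 * x centred on `target`.

    Solves the 2x2 weighted normal equations in closed form.
    """
    w = gaussian_weights(data_locations, target, bandwidth)
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, xs))
    swy = sum(wi * yi for wi, yi in zip(w, ys))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, xs))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, xs, ys))
    det = sw * swxx - swx * swx
    if abs(det) < 1e-12:
        raise ValueError("local design matrix is singular at this location")
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1


def predict_at(data_locations, xs, ys, new_location, new_x, bandwidth):
    """Re-estimate coefficients at a new location, then apply them to new_x.

    The kernel parameter (bandwidth) is needed here as well, which is the
    point made in the thread about moving prediction_location to prediction.
    """
    b0, b1 = local_coefficients(data_locations, xs, ys, new_location, bandwidth)
    return b0 + b1 * new_x


if __name__ == "__main__":
    locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [5.0, 8.0, 11.0, 14.0]  # illustrative data with y = 2 + 3 * x everywhere
    print(local_coefficients(locations, xs, ys, (0.5, 0.5), 1.0))
    print(predict_at(locations, xs, ys, (2.0, 2.0), 10.0, 1.0))
```

When the regression locations coincide with the sample locations, local_coefficients would be called once per sample point; diagnostics are outside the scope of this sketch.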
