[R] Odp: Finding pairs
Hi,

Without more details it is probably impossible to give you any reasonable advice. Do you have your data in R already? What form are they in? Are they in 2 columns of a data frame? How did you get them paired?

Without some more information, probably nobody will invest their time, as the problem does not seem trivial to me.

Regards
Petr

r-help-boun...@r-project.org wrote on 24.08.2010 20:28:42:

> Dear R Helpers,
>
> I am a newbie and was recently introduced to R. I have a large database containing the names of bank branch offices along with other details. I work in Operational Risk as envisaged by the Basel II Accord. To express my problem I am using only indicative data, which comes in coded format.
>
>   A (branch)   B (controlled by)
>   144
>   145
>   146
>   147          144
>   148          145
>   149          147
>   151          146
>   ..           ...
>
> where 144 etc. are branch codes in a given city and B is a subset of A. If a branch code appearing in A also appears in B (where it is paired with some other element of A; e.g. 144 appears in A, also appears in B, and is paired there with 147 of A), then 144 controls the operations of bank office 147. In turn, 147 itself appears in B and is paired with bank branch 149. Thus 149 is controlled by 147, and 147 is controlled by 144. Likewise there are more than 700 hundred branch name codes available.
>
> My objective is to group them as follows -
>
> Bank Branch 144 147 149 145 146 151 148 .
>
> or even the following output will do.
>
> 144 147 149 145 146 151 148 151 ..
>
> I understand I should begin by writing some R code, which I have tried, but as of now I am helpless. Please guide me.
>
> Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Odp: Finding pairs
Dear Mr Petr Pikal,

I am extremely sorry for the manner in which I raised the query. That was my first post to this R forum, and I was in fact a bit confused myself while drafting it; I apologize to all for consuming your precious time.

Let me try to restate my query more clearly.

I have two datasets, A and B, containing the names of branch offices of a particular bank, say XYZ plc. XYZ has a number of main branch offices (say Parent) and some small branch offices falling under the purview of these main branch offices (say Child). A and B contain the main branch office names as well as the small branch office names. B is a subset of A, and the branch names are coded. Thus we have two datasets A and B as follows (again, I am using only a portion of a large database, just to give some idea):

  A     B
  144
  145
  146
  147   144
  148   145
  149   147
  151   148

Branch 144 appears in A as well as in B, and in B it is mapped to 147. This means branch 147 comes under the purview of main branch 144. In turn, 147 controls branch 149 (since 147 also appears in B, where it is mapped to 149 of A). Similarly, branch 145 controls branch 148, which in turn controls bank branch 151, and so on.

So in the end I need an output something like -

  Main Branch   Branch office1   Branch office2
  144           147              149
  145           148              151
  146           NA               NA
  ...           ..

I understand that I may again not have put my query properly. But I must thank all of you for reading my query patiently and for replying earlier.

Thanks once again.

With warmest regards
Mike

--- On Wed, 25/8/10, Petr PIKAL petr.pi...@precheza.cz wrote:

> From: Petr PIKAL petr.pi...@precheza.cz
> Subject: Odp: [R] Finding pairs
> To: Mike Rhodes mike_simpso...@yahoo.co.uk
> Cc: r-help@r-project.org
> Date: Wednesday, 25 August, 2010, 6:39
>
> [...]
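One possible way to answer the questions about the form of the data is to hold the two columns in a data frame, with NA standing for the empty cells in B. The following is only a sketch under that assumption; the names DF and children_of are made up for illustration and do not come from the thread.

```r
# Mike's sample, row by row: B[i] is the branch that controls A[i];
# NA in B marks a main branch that nobody controls (assumed encoding).
A <- c(144, 145, 146, 147, 148, 149, 151)
B <- c(NA, NA, NA, 144, 145, 147, 148)
DF <- data.frame(A, B)

# Check the stated property that B is a subset of A (ignoring the empty cells).
b_is_subset <- all(B[!is.na(B)] %in% A)

# Main branches are the rows with nothing in B.
main <- DF$A[is.na(DF$B)]

# Hypothetical helper: the branches directly controlled by a given code.
children_of <- function(code, df) df$A[!is.na(df$B) & df$B == code]

children_of(144, DF)
children_of(147, DF)
```

With the pairing made explicit by row position like this, the question of which B value belongs to which A value no longer depends on how the table was typed into the mail.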
Re: [R] Odp: Finding pairs
Hm

r-help-boun...@r-project.org wrote on 25.08.2010 09:43:26:

> [...]
> Thus we have two datasets A and B as (again I am using only a portion of a large database just to have some idea)
>
>   A     B
>   144

What is here in B? Empty space?

>   145
>   146
>   147   144

How do you know that 144 from B relates to 147 in A? Is it according to their positions, i.e. the 4th item in B belongs to the 4th item in A?

>   148   145
>   149   147
>   151   148
>
> Now the branch 144 appears in A as well as in B and in B it is mapped with 147. This means branch 147 comes under the purview of main branch 144. Again 147 is controlling the branch 149 (since 147 also has appeared in B and is mapped with 149 of A). Similarly, branch 145 is controlling branch 148 which further controls operations of bank branch 151 and like wise.

Well, as you did not say anything about the structure of your data:

    A <- 144:151
    # B is padded with NA so that it matches the data frame printed below
    B <- c(NA, NA, NA, 144:148)
    data.frame(A, B)
        A   B
    1 144  NA
    2 145  NA
    3 146  NA
    4 147 144
    5 148 145
    6 149 146
    7 150 147
    8 151 148

    DF <- data.frame(A, B)
    main <- DF$A[is.na(DF$B)]
    branch1 <- DF[!is.na(DF$B), ]
    selected.branch1 <- branch1$A[branch1$B %in% main]
    branch2 <- branch1[!branch1$B %in% main, ]
    selected.branch2 <- branch2$A[branch2$B %in% selected.branch1]

and for cbinding your data, which has an uneven number of values, see Jim Holtman's answer to this:

How to cbind DF:s with differing number of rows?

Regards
Petr

> So in the end I need an output something like -
>
>   Main Branch   Branch office1   Branch office2
>   144           147              149
>   145           148              151
>   146           NA               NA
>   ...           ..
>
> [...]
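For the cbind step with an uneven number of values, one common approach is to pad the shorter vectors with NA first. The sketch below assumes the three vectors that the selection code yields on the toy data A = 144:151; pad_cbind is a hypothetical helper written for this illustration and is not necessarily what the referenced answer does.

```r
# Results of the selection steps on the toy data (A = 144:151)
main             <- 144:146
selected.branch1 <- 147:149
selected.branch2 <- 150:151

# Pad every vector with NA up to the longest length, then bind as columns.
pad_cbind <- function(...) {
  cols <- list(...)
  n <- max(lengths(cols))
  sapply(cols, function(x) {
    length(x) <- n   # extending a vector fills the new slots with NA
    x
  })
}

out <- pad_cbind(main    = main,
                 office1 = selected.branch1,
                 office2 = selected.branch2)
out
```

Note that the columns are lined up purely by position. On this toy data the rows happen to correspond to real chains, but in general each row should be checked (or built by matching codes) before reading it as main branch, office 1, office 2.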
Re: [R] Odp: Finding pairs
Dear Mr Petr PIKAL,

After reading the R code you provided, I realized that I would never have figured out how this could be done. I am going to re-read your code again and again to understand the logic and the commands you have used.

Thanks again from the heart for your kind advice.

Regards
Mike

--- On Wed, 25/8/10, Petr PIKAL petr.pi...@precheza.cz wrote:

> From: Petr PIKAL petr.pi...@precheza.cz
> Subject: Re: [R] Odp: Finding pairs
> To: Mike Rhodes mike_simpso...@yahoo.co.uk
> Cc: r-help@r-project.org
> Date: Wednesday, 25 August, 2010, 9:01
>
> [...]
Re: [R] Odp: Finding pairs
Hi

Well, I will add some explanation.

r-help-boun...@r-project.org wrote on 25.08.2010 11:24:38:

> [...]

    A <- 144:151
    # B is padded with NA so that it matches the data frame printed below
    B <- c(NA, NA, NA, 144:148)
    data.frame(A, B)
        A   B
    1 144  NA
    2 145  NA
    3 146  NA
    4 147 144
    5 148 145
    6 149 146
    7 150 147
    8 151 148

    DF <- data.frame(A, B)

This just makes a data frame with 2 columns, to have some data to play with.

    main <- DF$A[is.na(DF$B)]

These are the codes from A for which B is NA.

    branch1 <- DF[!is.na(DF$B), ]

This is a data frame of the remaining codes (other than main).

    selected.branch1 <- branch1$A[branch1$B %in% main]

These are the codes from column A for which the value in column B is one of the codes in main.

    branch2 <- branch1[!branch1$B %in% main, ]

This is the rest, i.e. the rows not yet selected,

    selected.branch2 <- branch2$A[branch2$B %in% selected.branch1]

and this selects the values from column A for which the value in column B is one of the codes in selected.branch1.

It works for this particular data, but I am not sure how it behaves with duplicates and other issues. It also depends on how your data is organised.

And if you are in a reading mood, you could also go through setdiff, merge, maybe the sqldf package, and the R Data Import/Export manual.

Regards
Petr
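Since merge is among the suggested readings, here is a rough sketch of how two left joins might produce the three-column output requested earlier in the thread. It assumes Mike's sample is stored with NA for the empty cells in B; the column names main, office1 and office2 are invented for the example.

```r
# Mike's sample data, with NA marking main branches (assumed encoding)
DF <- data.frame(A = c(144, 145, 146, 147, 148, 149, 151),
                 B = c(NA, NA, NA, 144, 145, 147, 148))

main  <- data.frame(main = DF$A[is.na(DF$B)])  # branches nobody controls
links <- DF[!is.na(DF$B), ]                    # rows where B controls A

# First level: offices directly controlled by a main branch.
# all.x = TRUE keeps main branches that control nothing (filled with NA).
level1 <- merge(main, links, by.x = "main", by.y = "B", all.x = TRUE)
names(level1)[names(level1) == "A"] <- "office1"

# Second level: offices controlled by a first-level office.
level2 <- merge(level1, links, by.x = "office1", by.y = "B", all.x = TRUE)
names(level2)[names(level2) == "A"] <- "office2"

result <- level2[order(level2$main), c("main", "office1", "office2")]
result
```

If a branch controls several offices, merge returns one row per combination, so duplicates show up as extra rows rather than being lost. Deeper hierarchies would need one more join per level, which is where a loop or a graph-style approach may become more convenient.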
Re: [R] Odp: Finding pairs
Hi:

I'm just ideating here (think IBM commercial...) but perhaps a graphical model approach might be worth looking into. It seems to me that Mr. Rhodes is looking for clusters of banks that are under the same ownership umbrella. That information is not directly available in a single variable, but can evidently be inferred from the matches between the two variables: B[i] controls A[i] if B[i] is nonempty. In the bank 144 - 147 - 149 example, 144 controls 147 and 147 controls 149, so it appears that some transitive relation holds among the set of matches as well. (Why is PacMan going through my head? :)

I know next to nothing about graphical models, but I'm thinking about igraph and some of the tools in the statnet bundle to tackle this problem. Does that make sense to anyone? Alternatives?

FWIW,
Dennis

On Wed, Aug 25, 2010 at 2:24 AM, Mike Rhodes mike_simpso...@yahoo.co.uk wrote:

> [...]
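The transitive relation described above can also be followed without any graph package. The sketch below is a plain base R stand-in for the graph idea, not igraph itself: it treats each pair as an edge from B[i] to A[i] and walks every branch up to its top-level main branch. The function name find_root and the NA encoding of empty cells are assumptions made for the example.

```r
# Mike's sample data, with NA marking main branches (assumed encoding)
DF <- data.frame(A = c(144, 145, 146, 147, 148, 149, 151),
                 B = c(NA, NA, NA, 144, 145, 147, 148))

# Hypothetical helper: follow the "controlled by" links upward until a
# branch with no controller is reached, and return that branch.
find_root <- function(code, df) {
  parent <- df$B[match(code, df$A)]
  steps <- 0
  while (!is.na(parent)) {
    code   <- parent
    parent <- df$B[match(code, df$A)]
    steps  <- steps + 1
    # Guard against malformed data in which the links form a loop.
    if (steps > nrow(df)) stop("cycle detected in control links")
  }
  code
}

DF$root <- vapply(DF$A, find_root, numeric(1), df = DF)

# One group per main branch, at any depth of the hierarchy.
groups <- split(DF$A, DF$root)
groups
```

Unlike the level-by-level selection, this does not need to know in advance how many levels the hierarchy has. A graph package would give the same grouping as the connected components of the control graph.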