[R] Odp: Finding pairs

2010-08-25 Thread Petr PIKAL
Hi

Without more detail it is probably impossible to give you any reasonable 
advice. Do you have your data in R already? What form is it in? Is it 
in two columns of a data frame? How did you get the values paired?

Without that information probably nobody will invest time in this, as 
the problem does not seem trivial to me.

Regards
Petr

r-help-boun...@r-project.org wrote on 24.08.2010 20:28:42:

 
 
 
 
 Dear R Helpers,

 I am a newbie and was recently introduced to R. I have a large database
 containing the names of bank branch offices along with other details. I
 work in Operational Risk as envisaged by the Basel II Accord.

 I am trying to express my problem using only indicative data, which
 comes in coded format.

 A (branch)   B (controlled by)
 144
 145
 146
 147          144
 148          145
 149          147
 151          146
 ..           ...

 where 144 etc. are branch codes in a given city and B is a subset of A.

 If a branch code appearing in A also appears in B, where it is paired
 with some other element of A (e.g. 144 appears in A and also in B, where
 it is paired with 147 of A), then 144 controls the operations of branch
 147. Again, 147 itself appears in B, paired with branch 149. Thus 149 is
 controlled by 147 and 147 is controlled by 144. There are more than 700
 branch codes in all.
 
 
 My objective is to group them as follows:

 Bank Branch

 144   147   149

 145

 146   151

 148
 .

 or even the following output will do:

 144
 147
 149

 145

 146
 151

 148
 151
 ..

 I understand I should write some R code to begin with, which I have
 tried, but as of now I am helpless. Please guide me.

 Mike
 
 
 
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Odp: Finding pairs

2010-08-25 Thread Mike Rhodes
Dear Mr Petr Pikal

I am extremely sorry for the way I raised the query. That was my first post 
to this R forum, and I was a bit confused myself while drafting it, for which 
I apologise to all for taking up your precious time. I will try to redraft my 
query more clearly as follows.

I have two datasets, A and B, containing the names of branch offices of a 
particular bank, say XYZ plc. XYZ bank has a number of main branch offices 
(say Parent) and some small branch offices falling under the purview of 
these main branch offices (say Child).

Data lists A and B consist of these main branch office names as well as the 
small branch office names. B is a subset of A, and the branch names are coded. 
Thus we have two datasets A and B as follows (again I am using only a portion 
of a large database, just to give some idea):

A     B
144
145
146
147   144
148   145
149   147
151   148

Branch 144 appears in A as well as in B, and in B it is mapped to 147. This 
means branch 147 comes under the purview of main branch 144. In turn, 147 
controls branch 149 (since 147 also appears in B, mapped to 149 of A).

Similarly, branch 145 controls branch 148, which in turn controls the 
operations of branch 151, and so on.

So in the end I need an output something like:

Main Branch   Branch office1   Branch office2
144           147              149
145           148              151
146           NA               NA
...

I understand that again I may not have put my query properly. But I must 
thank all of you for reading my query patiently and for replying earlier. 
Thanks once again.

With warmest regards

Mike 




Re: [R] Odp: Finding pairs

2010-08-25 Thread Petr PIKAL
Hm

r-help-boun...@r-project.org wrote on 25.08.2010 09:43:26:

 
 
 A     B
 144

What is here in B? Empty space?

 145
 146
 147   144

How do you know that 144 from B relates to 147 in A? Is it according to 
their positions, i.e. the 4th item in B belongs to the 4th item in A?

 148   145
 149   147
 151   148
 
 
 

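Well, as you did not say anything about the structure of your data, I made 
some up:

A <- 144:151
B <- c(NA, NA, NA, 144:148)
data.frame(A, B)
    A   B
1 144  NA
2 145  NA
3 146  NA
4 147 144
5 148 145
6 149 146
7 150 147
8 151 148
DF <- data.frame(A, B)
# main branches: codes in A with no controller in B
main <- DF$A[is.na(DF$B)]
# all remaining rows (branches that have a controller)
branch1 <- DF[!is.na(DF$B), ]
# first level: branches controlled directly by a main branch
selected.branch1 <- branch1$A[branch1$B %in% main]
# rows not yet selected
branch2 <- branch1[!branch1$B %in% main, ]
# second level: branches controlled by a first-level branch
selected.branch2 <- branch2$A[branch2$B %in% selected.branch1]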

To cbind results that have an uneven number of values, see Jim 
Holtman's answer to this question:

How to cbind DF:s with differing number of rows?
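
A minimal sketch of one way to place the three result vectors side by side when their lengths differ. This is not Jim Holtman's answer (which is not reproduced here); it simply assumes that padding the shorter vectors with NA is acceptable. The helper name pad_to is hypothetical.

```r
# Toy data from the message above.
A <- 144:151
B <- c(NA, NA, NA, 144:148)
DF <- data.frame(A, B)

main <- DF$A[is.na(DF$B)]
branch1 <- DF[!is.na(DF$B), ]
selected.branch1 <- branch1$A[branch1$B %in% main]
branch2 <- branch1[!branch1$B %in% main, ]
selected.branch2 <- branch2$A[branch2$B %in% selected.branch1]

# pad_to() is a hypothetical helper: extend x to length n, filling with NA.
pad_to <- function(x, n) {
  length(x) <- n
  x
}

cols <- list(main = main,
             branch1 = selected.branch1,
             branch2 = selected.branch2)
n <- max(lengths(cols))
result <- as.data.frame(lapply(cols, pad_to, n = n))
print(result)
```

Note that padding only stacks the columns next to each other. The rows line up with their controlling branch here because the toy data happens to be ordered that way; it is not guaranteed for real data.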

Regards
Petr


 

Re: [R] Odp: Finding pairs

2010-08-25 Thread Mike Rhodes
Dear Mr Petr PIKAL
After reading the R code you provided, I realized that I would never have 
figured out how this could be done. I am going to re-read your code again and 
again to understand the logic and the commands you have used.
Thanks again from the heart for your kind advice.
Regards
Mike


Re: [R] Odp: Finding pairs

2010-08-25 Thread Petr PIKAL
Hi

well, I will add some explanation

r-help-boun...@r-project.org wrote on 25.08.2010 11:24:38:

 Well as you did not say anything about structure of your data
 A <- 144:151
 B <- c(NA, NA, NA, 144:148)
 data.frame(A, B)
     A   B
 1 144  NA
 2 145  NA
 3 146  NA
 4 147 144
 5 148 145
 6 149 146
 7 150 147
 8 151 148
 DF <- data.frame(A, B)

This just makes a data frame with two columns, to have some data to play 
with.

 main <- DF$A[is.na(DF$B)]

These are the codes from A for which B is NA, i.e. the main branches.

 branch1 <- DF[!is.na(DF$B), ]

This is a data frame of the remaining codes (everything other than main).

 selected.branch1 <- branch1$A[branch1$B %in% main]

These are the codes from column A whose B value is one of the main branches.

 branch2 <- branch1[!branch1$B %in% main, ]

These are the rows not yet selected.

 selected.branch2 <- branch2$A[branch2$B %in% selected.branch1]

And this selects the values from column A whose B value is one of the 
selected.branch1 codes.

It works for this particular data, but I am not sure how it behaves with 
duplicates and other complications. It also depends on how your data is 
organised.
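
The two-level selection above can be generalised to chains of any depth. What follows is only a sketch, under assumptions not stated in the thread: each branch code appears exactly once in A, B holds NA for main branches, and there are no circular control relationships (a cycle would make the loop run forever). The function name chain_to_root is hypothetical, and the data are the values from Mike's redrafted example.

```r
# Mike's redrafted example: B[i] is the branch that controls A[i].
A <- c(144, 145, 146, 147, 148, 149, 151)
B <- c(NA, NA, NA, 144, 145, 147, 148)
DF <- data.frame(A, B)

# chain_to_root() is a hypothetical helper: follow the "controlled by"
# links upward from one branch until a main branch (B is NA) is reached.
chain_to_root <- function(code, DF) {
  path <- code
  parent <- DF$B[match(code, DF$A)]
  while (!is.na(parent)) {
    path <- c(parent, path)
    parent <- DF$B[match(parent, DF$A)]
  }
  path
}

# Leaves are the branches that control nobody, i.e. never appear in B.
leaves <- DF$A[!DF$A %in% DF$B]
chains <- lapply(leaves, chain_to_root, DF = DF)

# Pad every chain with NA to the longest one and stack them as rows.
depth <- max(lengths(chains))
out <- do.call(rbind, lapply(chains, function(p) {
  length(p) <- depth
  p
}))
colnames(out) <- c("main", paste0("branch", seq_len(depth)))[seq_len(depth)]
out <- out[order(out[, "main"]), , drop = FALSE]
print(out)
```

Each row is one chain from a main branch down to a branch that controls nobody, so a main branch with several sub-branches would appear in several rows.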

And if you are in a reading mood, you could also look at setdiff, merge, and 
maybe the sqldf package and the R Data Import/Export manual.

Regards
Petr


 

Re: [R] Odp: Finding pairs

2010-08-25 Thread Dennis Murphy
Hi:

I'm just ideating here (think IBM commercial...) but perhaps a graphical
model approach might be worth looking into. It seems to me that Mr. Rhodes
is looking for clusters of banks that are under the same ownership umbrella.
That information is not directly available in a single variable, but can
evidently be inferred from the matches between the two variables: B[i]
controls A[i] if B[i] is nonempty. In the bank 144 - 147 - 149 example,
144 controls 147 and 147 controls 149, so it appears that some transitive
relation holds among the set of matches as well. (Why is PacMan going
through my head? :)  I know next to nothing about graphical models, but I'm
thinking about igraph and some of the tools in the statnet bundle to tackle
this problem. Does that make sense to anyone? Alternatives?
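
The grouping idea can be illustrated without committing to any graph package. The sketch below is base R only, not igraph or statnet code. It treats each (B, A) pair as an edge in a forest and labels every branch with the main branch at the top of its tree. It assumes each code appears once in A, main branches have NA in B, and there are no cycles (a cycle would keep the loop from ending). The data are the values from Mike's redrafted example.

```r
# B[i] is the branch that controls A[i]; NA marks a main branch.
A <- c(144, 145, 146, 147, 148, 149, 151)
B <- c(NA, NA, NA, 144, 145, 147, 148)

# Start with each branch pointing at its direct controller
# (main branches point at themselves).
root <- ifelse(is.na(B), A, B)

# Repeatedly replace each label by that label's own controller
# until every label is a main branch.
repeat {
  up <- B[match(root, A)]
  if (all(is.na(up))) break
  root <- ifelse(is.na(up), root, up)
}

# One group per main branch, containing everything under its umbrella.
groups <- split(A, root)
print(groups)
```

Unlike a chain-per-row layout, this gives one group per main branch regardless of how the branches beneath it are arranged.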

FWIW,
Dennis
