Hello,

To select the rows that match a single IPC code, use `==`.
To select the rows that match any of several IPC codes, use `%in%`.
The code below shows both.


oecd <- read.table(text = "
Appln_id|Prio_Year|App_year|IPC
1|1999|2000|H04Q007/32
1|1999|2000|G06K019/077
1|1999|2000|H01R012/18
1|1999|2000|G06K017/00
1|1999|2000|H04M001/2745
1|1999|2000|G06K007/00
1|1999|2000|H04M001/02
1|1999|2000|H04M001/275
2|1991|1992|C12N015/62
2|1991|1992|C12N015/09
2|1991|1992|C07K019/00
2|1991|1992|C07K016/26
", header = TRUE, sep = "|")


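# One code to match exactly, and a vector of several codes to match
select_one <- "H04Q007/32"
select_many <- c("H04Q007/32", "H04M001/275")

# Keep only the rows whose IPC equals the single code
oecd2 <- subset(oecd, IPC == select_one)
# Keep only the rows whose IPC is any of the codes in select_many
oecd3 <- subset(oecd, IPC %in% select_many)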
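
The same selection can also be written with plain logical indexing, which
some people prefer inside scripts and functions. The following is only a
minimal sketch: the names `keep`, `idx` and `oecd_keep` are made up for
illustration, the data is a four-row stand-in for the real file, and
`stringsAsFactors = FALSE` is set on the assumption that the IPC codes
should stay as character strings.

```r
# Small stand-in for the real file, same layout as the data in the question
oecd <- read.table(text = "
Appln_id|Prio_Year|App_year|IPC
1|1999|2000|H04Q007/32
1|1999|2000|G06K019/077
1|1999|2000|H04M001/275
2|1991|1992|C12N015/62
", header = TRUE, sep = "|", stringsAsFactors = FALSE)

# Hypothetical vector of the IPC codes to keep
keep <- c("H04Q007/32", "H04M001/275")

# Logical vector: TRUE for the rows whose IPC is one of the wanted codes
idx <- oecd$IPC %in% keep

# Keep the matching rows; all other rows are dropped
oecd_keep <- oecd[idx, ]

# Number of rows kept
nrow(oecd_keep)
```

Negating the index, `oecd[!idx, ]`, would give the rows that were dropped
instead.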


Hope this helps,

Rui Barradas

On 1/3/2018 7:53 PM, Saptorshee Kanto Chakraborty wrote:
Hello,

I have patent data from the OECD in a delimited text file, with IPC as one
of the columns. I want to filter the data, keeping only the rows with
certain IPC codes in that column and deleting the rows that do not have the
IPCs I need. Could anybody guide me in doing this? The IPC codes are string
variables.

The data looks like the sample below, but it is a huge dataset containing
more than 11 million rows.


Appln_id|Prio_Year|App_year|IPC
1|1999|2000|H04Q007/32
1|1999|2000|G06K019/077
1|1999|2000|H01R012/18
1|1999|2000|G06K017/00
1|1999|2000|H04M001/2745
1|1999|2000|G06K007/00
1|1999|2000|H04M001/02
1|1999|2000|H04M001/275
2|1991|1992|C12N015/62
2|1991|1992|C12N015/09
2|1991|1992|C07K019/00
2|1991|1992|C07K016/26



Thanking You


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

