This may be of use: https://www.nlm.nih.gov/research/umls/rxnorm/overview.html
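
Separately, for the ID format described in the quoted message (company prefix, numeric digits, optional two- or three-letter abbreviation), here is a minimal sketch of how such IDs could be split into parts before looking them up anywhere. The names `CompoundId` and `parse_compound_id` are hypothetical, the sample ID "ABC-123-AAB" is made up, and the assumption that the hyphen after the prefix is optional is mine, based only on the CNTO148 example. It does not explain what the abbreviations mean.

```python
import re
from typing import NamedTuple, Optional


class CompoundId(NamedTuple):
    """Parts of a 'Primary Compound ID' (hypothetical structure)."""
    company: str           # company prefix, e.g. "CNTO"
    number: str            # numeric part, kept as text to preserve leading zeros
    suffix: Optional[str]  # two- or three-letter abbreviation, if present


# Assumed shape: <letters>[-]<digits>[-<2 or 3 letters>]
# The hyphen after the prefix is optional so that "CNTO148" also matches.
_ID_PATTERN = re.compile(
    r"(?P<company>[A-Za-z]+)-?(?P<number>\d+)(?:-(?P<suffix>[A-Za-z]{2,3}))?"
)


def parse_compound_id(text: str) -> Optional[CompoundId]:
    """Return the parts of an ID, or None if the text does not fit the shape."""
    match = _ID_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    suffix = match.group("suffix")
    return CompoundId(
        company=match.group("company").upper(),
        number=match.group("number"),
        suffix=suffix.upper() if suffix else None,
    )


if __name__ == "__main__":
    for candidate in ["CNTO148", "ABC-123-AAB", "SIMPONI"]:
        print(candidate, "->", parse_compound_id(candidate))
```

A trade name such as "SIMPONI" has no numeric part, so it is rejected rather than misread as an ID; that at least separates candidate IDs from brand names in text extracted from a document.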

On Mon, May 20, 2019 at 3:19 PM Peter Abramowitsch <[email protected]> wrote:

> I used to work for a division of Hearst that also owns the company First
> Databank. They have an electronic compendium of information about every
> drug, where you can find its generic and proprietary forms, its primary
> ingredient(s), its therapeutic class, forms, dosages, side effects, disease
> indications, etc. Much of this you can now get from RxNorm, I think.
> The subscription fee for FDB is pretty high, but the information is very
> well curated.
>
> Peter
>
> On Mon, May 20, 2019 at 5:02 PM Hari, Sekhar <[email protected]> wrote:
>
> > Hi -
> >
> > My question is a little different, and I'm OK if there is a way to solve
> > this puzzle through cTAKES, through UMLS lookups, or through lookups in
> > other published databases. At this time, I really don't know if this can
> > be solved through machine learning algorithms.
> >
> > Problem:
> > I've been asked to find out if the following is possible:
> > "Given a pharma regulatory document (say a searchable PDF document)
> > related to drug(s), predict the corresponding 'Primary Compound ID'."
> >
> > The format of a primary compound ID could be <<pharma company
> > name>>-<<numeric digits>>-<<three or two letters abbreviation>>.
> >
> > To make the scenario easier, I'll consider the following case:
> > Primary Compound ID: CNTO148.
> > This is a deviation from the above format. If we split this ID, CNTO
> > would represent the pharma company (Centocor Biotech, Inc). I don't know
> > what the number 148 represents.
> >
> > However, CNTO148 is the pre-marketing name given during clinical trial
> > phases. Its actual trademark is "SIMPONI" and the International
> > Nonproprietary Name (INN) is "Golimumab". The condition mentioned for
> > this drug is 'Rheumatoid Arthritis'.
> >
> > Question:
> > If I could use cTAKES to identify the product as "SIMPONI" and the
> > indication as 'Rheumatoid Arthritis', is there a way to identify or
> > derive its 'Primary Compound ID' (in this case CNTO148, sometimes called
> > the 'Controlling Product') through some mechanism?
> >
> > My analysis:
> > If I query the ClinicalTrials.gov data using the drug name, I'm able to
> > find the corresponding 'Primary Compound ID' that was used during the
> > clinical study. But this ID is not available for all drug products in
> > the ClinicalTrials.gov database. I'm looking for a consistent way to
> > derive the 'Primary Compound ID', if these IDs are registered anywhere.
> >
> > Other questions:
> > What do the abbreviations used in the 'Primary Compound ID' mean (the
> > three or two letters abbreviation in the format defined above)?
> > Some example abbreviations (there are many more):
> >
> > * AAB
> > * AC
> > * AN
> > * AAA
> > * AAC
> > * AMK
> > * ZBR
> > * AER
> > * AEN
> >
> > Is there a vocabulary where these are listed that I could study?
> >
> > Thanks
> > Sekhar Hari | AI Program Lead | Health Sciences R&D | Asia Pacific
> > Solutions Delivery Center
> > +91 814 7027 779 (C)
