I used to work for a division of Hearst, which also owns First Databank
(FDB).  FDB has an electronic compendium of information about every drug,
where you can find its generic and proprietary forms, its primary
ingredient(s), its therapeutic class, forms, dosages, side effects, disease
indications, and so on.  Much of this you can now get from RxNorm, I think.
The subscription fee for FDB is pretty high, but the information is very
well curated.
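
On the ID format described in the quoted message below: before looking
anything up, it may help to split candidate IDs into their parts. Here is a
minimal sketch in Python, assuming the company part is letters only and that
the hyphens and the two- or three-letter suffix are optional (so that
CNTO148 also parses). The names CompoundId and parse_compound_id are made up
for illustration, and "ABC-123-AAB" is a hypothetical ID, not a real one.

```python
import re
from typing import NamedTuple, Optional


class CompoundId(NamedTuple):
    company: str           # company prefix, e.g. "CNTO"
    number: str            # numeric digits, kept as text to preserve zeros
    suffix: Optional[str]  # two- or three-letter abbreviation, if present


# Assumed shape: letters, optional hyphen, digits, then an optional
# hyphen plus a two- or three-letter suffix. Real IDs may differ.
_PATTERN = re.compile(r"^([A-Za-z]+)-?(\d+)(?:-([A-Za-z]{2,3}))?$")


def parse_compound_id(text: str) -> Optional[CompoundId]:
    """Split a candidate ID into its parts, or return None if it does not fit."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    company, number, suffix = match.groups()
    return CompoundId(company.upper(), number, suffix.upper() if suffix else None)


print(parse_compound_id("CNTO148"))
print(parse_compound_id("ABC-123-AAB"))  # hypothetical ID in the full format
print(parse_compound_id("SIMPONI"))      # a trade name, so no match
```

This only recognizes the shape of an ID. It says nothing about what the
suffix means or which product the ID belongs to.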

Peter

On Mon, May 20, 2019 at 5:02 PM Hari, Sekhar <sekhar.h...@cgi.com> wrote:

> Hi -
>
> My question is a little different, and I'm OK with any way to solve
> this puzzle, whether through cTAKES, UMLS lookups, or lookups in other
> published databases. At this time, I really don't know if this can be
> solved through Machine Learning algorithms.
>
> Problem:
> I've been asked to find out if the following is possible:
> "Given a pharma regulatory document (say a searchable PDF document)
> related to drug(s), predict the corresponding 'Primary Compound ID'."
>
> The format of a primary compound ID could be <<pharma company
> name>>-<<numeric digits>>-<<two- or three-letter abbreviation>>.
>
> To make the scenario easier, I'll consider the following case:
> Primary Compound ID: CNTO148.
> This deviates from the above format. If we split this ID, CNTO would
> represent the pharma company (Centocor Biotech, Inc.). I don't know
> what the number 148 represents.
>
> However, CNTO148 is the pre-marketing name given during clinical trial
> phases. Its actual trademark is "SIMPONI" and the International
> Nonproprietary Name (INN) is "Golimumab". The condition mentioned for this
> drug is 'Rheumatoid Arthritis'.
>
> Question:
> If I could use cTAKES to identify the product as "SIMPONI" and the
> indication as 'Rheumatoid Arthritis', is there a way to identify or derive
> its 'Primary Compound ID' (in this case CNTO148, sometimes called the
> 'Controlling Product') through some mechanism?
>
> My analysis:
> If I query the ClinicalTrials.gov data using the drug name, I'm able to
> find the corresponding 'Primary Compound ID' that was used during the
> clinical study. But this ID is not available for all drug products in the
> ClinicalTrials.gov database. I'm looking for a consistent way to derive the
> 'Primary Compound ID' if these IDs are registered anywhere.
>
> Other questions:
> What do the abbreviations used in the 'Primary Compound ID' mean (the
> two- or three-letter abbreviation in the format defined above)?
> Some example abbreviations (there are many more):
>
> * AAB
> * AC
> * AN
> * AAA
> * AAC
> * AMK
> * ZBR
> * AER
> * AEN
>
> Is there a vocabulary where these are listed that I could study?
>
> Thanks
> Sekhar Hari | AI Program Lead | Health Sciences R&D | Asia Pacific
> Solutions Delivery Center
> +91 814 7027 779 (C)
>
