I think you want

x <- read_xml('<?xml version="1.0" ?>
  <WorkSet xmlns="http://labkey.org/etl/xml">
  <Description>MFIA 9-Plex (CharlesRiver)</Description>
</WorkSet>')

The collapse argument doesn't do what you think it does.
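
A minimal sketch of the whole round trip (not part of the original message), assuming the stray ';' that the archive shows after the namespace URI is not in the real file and that a reasonably recent xml2 is installed. collapse is the separator string placed between the joined elements, so it should be something like "\n" rather than TRUE. Base paste() is used instead of str_c() only to avoid the extra dependency, and "d1" is the prefix xml2 assigns to a default namespace in xml_ns().

```r
library(xml2)

with_ns_xml <- c('<?xml version="1.0" ?>',
                 '<WorkSet xmlns="http://labkey.org/etl/xml">',
                 '<Description>MFIA 9-Plex (CharlesRiver)</Description>',
                 '</WorkSet>')

# collapse is the separator between elements, so pass a string, not TRUE
xml_string <- paste(with_ns_xml, collapse = "\n")

# read_xml() returns a document object; xml_find_all() needs that object,
# not the character string
x <- read_xml(xml_string)

# Option 1: keep the namespace and use the prefix from xml_ns() in the XPath
ns <- xml_ns(x)
found_ns <- xml_find_all(x, "/d1:WorkSet//d1:Description", ns = ns)

# Option 2: strip the default namespace (modifies x in place),
# after which the plain XPath works
xml_ns_strip(x)
found_plain <- xml_find_all(x, "/WorkSet//Description")
```

Either option should find the single Description node. The second is simpler when the namespace carries no information you need.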

Hadley

On Tue, Jan 31, 2017 at 5:36 PM, Mark Sharp <msh...@txbiomed.org> wrote:
> Hadley,
>
> Thank you. I am able to get the xml_ns_strip() function to work with my file 
> directly so I will likely be able to reach my immediate goal.
>
> However, I still have had no success with understanding the namespace 
> problem. I am not able to use read_xml() using the object I generated for the 
> reproducible example, which is simply a character vector of length 4 having 
> the contents of the XML file as produced by readLines(). I then used dput() to 
> define the structure. The resulting structure apparently is not to the liking 
> of read_xml(). I have reproduced the necessary code here for your 
> convenience. The error is below.
>
> ##
> library(xml2)
> library(stringr)
> with_ns_xml <- c("<?xml version=\"1.0\" ?>",
>                  "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
>                  "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>                  "</WorkSet>")
> ## without str_c() collapse it complains of a vector of length > 1 also.
> read_xml(str_c(with_ns_xml, collapse = TRUE))
>
> ## produces the following error message.
> Error in doc_parse_raw(x, encoding = encoding, base_url = base_url, as_html = 
> as_html,  :
>   Start tag expected, '<' not found [4]
>
> I have similar issues with xml2::xml_find_all
> xml_find_all(str_c(with_ns_xml, collapse = TRUE), "/WorkSet//Description")
>
> ## Produces the following error message.
> Error in UseMethod("xml_find_all") :
>   no applicable method for 'xml_find_all' applied to an object of class 
> "character"
>
>
>
> R. Mark Sharp, Ph.D.
> msh...@txbiomed.org
>
>> On Jan 31, 2017, at 4:27 PM, Hadley Wickham <h.wick...@gmail.com> wrote:
>>
>> See the last example in ?xml2::xml_find_all or use xml2::xml_ns_strip()
>>
>> Hadley
>>
>> On Tue, Jan 31, 2017 at 9:43 AM, Mark Sharp <msh...@txbiomed.org> wrote:
>>> I am trying to read a series of XML files that use a namespace and I have 
>>> failed, thus far, to discover the proper syntax. I have a reproducible 
>>> example below. I have two XML character strings defined: one without a 
>>> namespace and one with. I show that I can successfully extract the node 
>>> using the XML string without the namespace and fail when using the XML 
>>> string with the namespace.
>>>
>>> Mark
>>> PS I am having the same problem with the xml2 package and am hoping 
>>> understanding one will help with the other.
>>>
>>> ##
>>> library(XML)
>>> ## The first XML text (no_ns_xml) does not have a namespace defined
>>> no_ns_xml <- c("<?xml version=\"1.0\" ?>", "<WorkSet>",
>>>               "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>>>               "</WorkSet>")
>>> l_no_ns_xml <-xmlTreeParse(no_ns_xml, asText = TRUE, getDTD = FALSE,
>>>                           useInternalNodes = TRUE)
>>> ## The node is found
>>> getNodeSet(l_no_ns_xml, "/WorkSet//Description")
>>>
>>> ## The second XML text (with_ns_xml) has a namespace defined
>>> with_ns_xml <- c("<?xml version=\"1.0\" ?>",
>>>                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
>>>                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>>>                 "</WorkSet>")
>>>
>>> l_with_ns_xml <-xmlTreeParse(with_ns_xml, asText = TRUE, getDTD = FALSE,
>>>                               useInternalNodes = TRUE)
>>> ## The node is not found
>>> getNodeSet(l_with_ns_xml, "/WorkSet//Description")
>>> ## I attempt to provide the namespace, but fail.
>>> ns <- "http://labkey.org/etl/xml"
>>> names(ns)[1] <- "xmlns"
>>> getNodeSet(l_with_ns_xml, "/WorkSet//Description", namespaces = ns)
>>>
>>> R. Mark Sharp, Ph.D.
>>> Director of Data Science Core
>>> Southwest National Primate Research Center
>>> Texas Biomedical Research Institute
>>> P.O. Box 760549
>>> San Antonio, TX 78245-0549
>>> Telephone: (210)258-9476
>>> e-mail: msh...@txbiomed.org
>>>
>>> CONFIDENTIALITY NOTICE: This e-mail and any files and/or...{{dropped:10}}
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>> --
>> http://hadley.nz
>



-- 
http://hadley.nz
