I've exported a Gmail archive in MBOX format using takeout.google.com. The MBOX archive also includes GChat messages. However, the GChat messages do not include a Date header. Instead, the date sent appears only in what looks like a non-conforming RFC 822 header (the first line of the sample below), which the Tika mbox parser does not recognize. Does anyone have experience extracting metadata from Gmail exports, specifically GChat messages? Any help or guidance would be appreciated.
From 1558692903658457318@xxx Tue Feb 07 16:36:29 +0000 2017
X-GM-THRID: 1558691798399711410
X-Gmail-Labels: Chat
From: [REDACTED]
MIME-Version: 1.0
Content-Type: text/plain

body of the chat message
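
For illustration, here is a rough sketch of one possible workaround outside Tika: read the timestamp directly from the leading "From " line when no Date header is present. It is written in Python only to show the idea. The function name date_from_separator and the format string are hypothetical, and the layout is inferred from the single sample above, so other exports may differ.

```python
from datetime import datetime
from typing import Optional

# Assumed layout, inferred from the one sample message above:
#   From <id>@xxx <Day> <Mon> <DD> <HH:MM:SS> <+ZZZZ> <YYYY>
SEPARATOR_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def date_from_separator(line: str) -> Optional[datetime]:
    """Return the timestamp from an mbox "From " separator line, or None."""
    # Some archives escape the separator as ">From "; tolerate that form.
    if line.startswith(">"):
        line = line[1:]
    if not line.startswith("From "):
        return None
    # Split into "From", the sender token, and the remaining date text.
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) < 3:
        return None
    try:
        return datetime.strptime(parts[2].strip(), SEPARATOR_DATE_FORMAT)
    except ValueError:
        # The date text does not match the assumed layout.
        return None


if __name__ == "__main__":
    sample = "From 1558692903658457318@xxx Tue Feb 07 16:36:29 +0000 2017"
    print(date_from_separator(sample))
```

The function returns None instead of raising when the line does not match, so a caller could fall back to it only for messages that lack a Date header and ignore it otherwise.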
