Hi David,

I am hesitant to draw your (and others') attention to this because the JSON 
metadata collection features in ClamAV are very much a work in progress and 
most likely won't help in your situation. The output is not easy to work with 
and we intend to improve it in future versions. I will also note that libjson-c 
is not built into our Windows version of ClamAV, so this feature is not 
enabled on Windows.

On Linux/Unix systems where libjson-c has been built into ClamAV (configured 
with --with-json=</path/to/json-c/install>), you can run `clamscan --gen-json 
--leave-temps --tempdir=</path/to/output/directory> TARGETFILE`.  This will 
result in embedded buffers and files being written to the provided tempdir 
output directory.  In addition, metadata about the scan target file will be 
written in JSON format to the tempdir.  The name of this file, at present, is 
randomly generated, as are most of the other files written to that directory.  
Basically, you need to search for a file starting with ` { "Magic": 
"CLAMJSONv0", `.  This will be the JSON metadata file.  
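
Since the file name is random, that search can be scripted. The following is a rough Python sketch, not something that ships with ClamAV. The function name `find_clamav_json` is made up for illustration, and it assumes the metadata file is valid JSON whose top-level "Magic" value is "CLAMJSONv0" and that the magic string appears within the first 256 bytes.

```python
import json
from pathlib import Path

MAGIC = "CLAMJSONv0"  # magic value quoted in the message above


def find_clamav_json(tempdir):
    """Return (path, parsed_dict) pairs for files under tempdir that
    look like ClamAV JSON metadata (hypothetical helper)."""
    results = []
    for path in Path(tempdir).rglob("*"):
        if not path.is_file():
            continue
        try:
            with path.open("rb") as fh:
                head = fh.read(256)  # only peek at the start of each file
        except OSError:
            continue  # unreadable file, skip it
        # Cheap pre-filter: must start with '{' and mention the magic early on.
        if not head.lstrip().startswith(b"{") or MAGIC.encode() not in head:
            continue
        try:
            doc = json.loads(path.read_text(encoding="utf-8", errors="replace"))
        except ValueError:
            continue  # started like JSON but did not parse
        if isinstance(doc, dict) and doc.get("Magic") == MAGIC:
            results.append((path, doc))
    return results
```

Calling `find_clamav_json("/path/to/output/directory")` after the clamscan run should return the metadata file(s) together with their parsed contents. It walks subdirectories as well, in case the tempdir is not flat.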

For PDFs containing JS/JavaScript or OpenActions (and some other types of 
metadata), you'll find a count of each in the PDFStats dictionary 
in this JSON.

Example output:
{
    "Magic": "CLAMJSONv0",
    "RootFileType": "CL_TYPE_PDF",
    "FileType": "CL_TYPE_PDF",
    "FileSize": 21272,
    "FileMD5": "", // sanitized
    "PDFStats": {
        "PDFVersion": "1.4",
        "JavascriptObjects": [
            31,
            32
        ],
        "IncorrectPagesCount": true,
        "ObjectsWithoutDictionaries": [
            4,
            5
        ],
        "Author": "", // sanitized
        "Producer": "", // sanitized
        "JavaScriptObjectCount": 2,
        "DeflateObjectCount": 8,
        "ImageCount": 6,
        "StandardCount": 1,
        "OpenActionCount": 1,
        "PageCount": 1
    },
    "ContainedObjects": [
        {
            "FileType": "CL_TYPE_TEXT_UTF16BE",
            "FileSize": 135,
            "FileMD5": "" // sanitized
        },
        {
            "FileType": "CL_TYPE_TEXT_ASCII",
            "FileSize": 456,
            "FileMD5": "" // sanitized
        },
        {
            "FileType": "CL_TYPE_BINARY_DATA",
            "FileSize": 3049,
            "FileMD5": "" // sanitized
        }
    ]
}
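
Once the metadata has been parsed (for example with Python's `json.load`), checking a PDF for JavaScript or OpenActions comes down to reading two counters. This is only a sketch: the key names are taken from the example output above, the helper name `pdf_active_content` is invented, and treating a missing key as zero is an assumption.

```python
def pdf_active_content(doc):
    """Return the non-zero JavaScript/OpenAction counters found in a
    parsed ClamAV JSON metadata dict (hypothetical helper)."""
    stats = doc.get("PDFStats") or {}
    counts = {
        # Key names as they appear in the example output; a missing key
        # is assumed to mean the item was not seen.
        "JavaScriptObjectCount": stats.get("JavaScriptObjectCount", 0),
        "OpenActionCount": stats.get("OpenActionCount", 0),
    }
    return {name: value for name, value in counts.items() if value}
```

An empty result means neither counter was present or non-zero. Anything else names what was found and how many times.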

I realize this doesn't help much in your situation.  Your best option may be to 
use a signature like the one Arnaud described.  In the future, though, I would 
like to make this information easier to generate and easier to find.  And of 
course I would like to make it work on Windows.  I will also note that the 
present version of ClamAV won't create metadata for files embedded in other 
files.  I hope to make this JSON metadata feature work recursively and 
make the tempdir easier to navigate.  Someday!

Regards,
Micah



On 4/11/19, 1:21 PM, "clamav-users on behalf of David Hendrick" 
<clamav-users-boun...@lists.clamav.net on behalf of 
david.hendr...@meetingsbooker.com> wrote:

    Hi there,
    Does anyone know if there's a way to have ClamAV detect PDF files that have
    items such as "OpenAction" or "JavaScript" or "JS"?
    
    Thanks,
    David
    
    
    _______________________________________________
    
    clamav-users mailing list
    clamav-users@lists.clamav.net
    https://lists.clamav.net/mailman/listinfo/clamav-users
    
    
    Help us build a comprehensive ClamAV guide:
    https://github.com/vrtadmin/clamav-faq
    
    http://www.clamav.net/contact.html#ml
    

