I emailed the author of the paper above. He said no: it is not suitable for NLP
tasks and will only work on image data (so GPT-2 is the agnostic/general one):

"Thanks for your interest. The key novelty in I-GPT is the transformer, which 
utilizes multiple head attention models to get global information for scene/NLP 
understanding.
Hence, they can generate correct content for arbitrary inputs. However, their 
model needs expensive computing costs and high memory space as the transformer 
stores the global relationship of each key.

Our model utilizes the CNN structure, the attention model is only used in one 
layer for copying information from visible regions. I don’t think this 
CNN-Based structure is suitable for the NLP task."
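
To put a rough number on the memory point in the quote, here is a minimal sketch of my own (not from the paper or from the author's reply). It assumes one token per pixel, one full attention layer, and 4-byte scores; the function names and the 32x32 / 64x64 grid sizes are hypothetical, picked only for illustration.

```python
def attention_score_entries(seq_len: int, num_heads: int = 1) -> int:
    """Pairwise attention scores held by one full attention layer.

    Every token attends to every other token, so the count grows
    quadratically with sequence length: heads * n * n.
    """
    return num_heads * seq_len * seq_len


def attention_score_bytes(seq_len: int, num_heads: int = 1,
                          bytes_per_score: int = 4) -> int:
    """Memory for those scores, assuming 4-byte (float32) values."""
    return attention_score_entries(seq_len, num_heads) * bytes_per_score


if __name__ == "__main__":
    # Illustrative grid sizes (assumption): one token per pixel.
    for side in (32, 64):
        n = side * side
        mib = attention_score_bytes(n) / 2**20
        print(f"{side}x{side} grid: {n} tokens, {mib:.0f} MiB per head per layer")
```

Doubling the side of the grid gives 4 times the tokens and 16 times the score memory, which is the cost he is contrasting with a CNN that uses attention in only one layer.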
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T48eb73fe225c230b-M029f284fb93f6c8e24efb17d